author navanchauhan <navanchauhan@gmail.com> 2023-02-08 17:40:27 -0700
committer navanchauhan <navanchauhan@gmail.com> 2023-02-08 17:40:27 -0700
commit e3c2fac4f49859268d2f337ecaa64c41e3a6bd1d (patch)
tree d336786cbcd3f8d39f4914b734741378126d22df /Content/posts
parent 0d34e3633ba36edd049ce3ba52908e061b7273d5 (diff)
new post
Diffstat (limited to 'Content/posts')
-rw-r--r-- Content/posts/2023-02-08-Interact-with-siri-from-the-terminal.md 229
1 files changed, 229 insertions, 0 deletions
diff --git a/Content/posts/2023-02-08-Interact-with-siri-from-the-terminal.md b/Content/posts/2023-02-08-Interact-with-siri-from-the-terminal.md
new file mode 100644
index 0000000..78de77c
--- /dev/null
+++ b/Content/posts/2023-02-08-Interact-with-siri-from-the-terminal.md
@@ -0,0 +1,229 @@
+---
+date: 2023-02-08 17:21
+description: Code snippet to interact with Siri by issuing commands from the command-line.
+tags: Tutorial, Code-Snippet, Python, Siri, macOS, AppleScript
+---
+
+# Interacting with Siri using the command line
+
+My main objective was to see if I could issue multi-intent commands in one go. Obviously, Siri cannot do that on its own (neither can Alexa, Cortana, or Google Assistant). The script here can either issue a single command, or use OpenAI's `text-davinci-003` model to split a request into multiple commands and pass them on to Siri one at a time.
+
+## Prerequisites
+
+* Run macOS
+* Enable Type to Siri (System Settings > Accessibility > Siri > Type to Siri)
+* Enable the Terminal to control System Events (The first time you run the script, it will prompt you to enable it)
+
+## Show me ze code
+
+If you are here just for the code:
+
+```python
+import argparse
+import applescript
+import openai
+
+from os import getenv
+
+openai.api_key = getenv("OPENAI_KEY")
+engine = "text-davinci-003"
+
+def execute_with_llm(command_text: str) -> None:
+    llm_prompt = f"""You are provided with multiple commands as a single command. Break down all the commands and return them in a list of strings. If you are provided with a single command, return a list with a single string, trying your best to understand the command.
+
+    Example:
+    Q: "Turn on the lights and turn off the lights"
+    A: ["Turn on the lights", "Turn off the lights"]
+
+    Q: "Switch off the lights and then play some music"
+    A: ["Switch off the lights", "Play some music"]
+
+    Q: "I am feeling sad today, play some music"
+    A: ["Play some cheerful music"]
+
+    Q: "{command_text}"
+    A:
+    """
+
+    completion = openai.Completion.create(engine=engine, prompt=llm_prompt, max_tokens=len(command_text.split(" "))*2)
+
+    for task in eval(completion.choices[0].text):
+        execute_command(task)
+
+
+def execute_command(command_text: str) -> None:
+    """Execute a Siri command."""
+
+    script = applescript.AppleScript(f"""
+        tell application "System Events" to tell the front menu bar of process "SystemUIServer"
+            tell (first menu bar item whose description is "Siri")
+                perform action "AXPress"
+            end tell
+        end tell
+
+        delay 2
+
+        tell application "System Events"
+            set textToType to "{command_text}"
+            keystroke textToType
+            key code 36
+        end tell
+    """)
+
+    script.run()
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("command", nargs="?", type=str, help="The command to pass to Siri", default="What time is it?")
+    parser.add_argument('--openai', action=argparse.BooleanOptionalAction, help="Use OpenAI to detect multiple intents", default=False)
+    args = parser.parse_args()
+
+    if args.openai:
+        execute_with_llm(args.command)
+    else:
+        execute_command(args.command)
+```
+
+Usage:
+
+```bash
+python3 main.py "play some taylor swift"
+python3 main.py "turn off the lights and play some music" --openai
+```
+
+## ELI5
+
+I am not actually going to explain it as if I were explaining it to a five-year-old kid.
+
+### AppleScript
+
+In the age of Siri Shortcuts, AppleScript can still do more. It is a scripting language created by Apple that can help you automate pretty much anything you see on your screen.
+
+We use the following AppleScript to trigger Siri and then type in our command:
+
+```applescript
+tell application "System Events" to tell the front menu bar of process "SystemUIServer"
+    tell (first menu bar item whose description is "Siri")
+        perform action "AXPress"
+    end tell
+end tell
+
+delay 2
+
+tell application "System Events"
+    set textToType to "Play some rock music"
+    keystroke textToType
+    key code 36
+end tell
+```
+
+This first triggers Siri, waits for a couple of seconds, and then types in our command. In the script, this functionality is handled by the `execute_command` function.
+
+```python
+import applescript
+
+def execute_command(command_text: str) -> None:
+    """Execute a Siri command."""
+
+    script = applescript.AppleScript(f"""
+        tell application "System Events" to tell the front menu bar of process "SystemUIServer"
+            tell (first menu bar item whose description is "Siri")
+                perform action "AXPress"
+            end tell
+        end tell
+
+        delay 2
+
+        tell application "System Events"
+            set textToType to "{command_text}"
+            keystroke textToType
+            key code 36
+        end tell
+    """)
+
+    script.run()
+```
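
One caveat with `execute_command` as written: `command_text` is pasted straight into an AppleScript string literal, so a command that contains a double quote would end the literal early. Below is a minimal sketch of one way to guard against that, assuming AppleScript's backslash escapes for `"` and `\` inside string literals. The names `escape_for_applescript` and `build_siri_script` are hypothetical helpers, not part of the original script, and the sketch only builds the typing part of the script source without running it.

```python
def escape_for_applescript(text: str) -> str:
    """Escape backslashes and double quotes for an AppleScript string literal."""
    # Escape backslashes first so the ones added for quotes are not doubled again.
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_siri_script(command_text: str) -> str:
    """Return AppleScript source that types command_text (hypothetical helper)."""
    safe_text = escape_for_applescript(command_text)
    return f'''
tell application "System Events"
    set textToType to "{safe_text}"
    keystroke textToType
    key code 36
end tell
'''


# Print the generated source to inspect it before handing it to AppleScript.
print(build_siri_script('play "Hey Jude"'))
```

The returned string could then be passed to `applescript.AppleScript(...)` the same way the function above does.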
+
+### Multi-Intent Commands
+
+We can call OpenAI's API to complete our prompt and extract multiple commands. OpenAI is not a hard requirement: an open model such as Google's Flan-T5, loaded through Hugging Face's transformers library, could fill the same role.
+
+#### Ze Prompt
+
+```text
+You are provided with multiple commands as a single command. Break down all the commands and return them in a list of strings. If you are provided with a single command, return a list with a single string, trying your best to understand the command.
+
+ Example:
+ Q: "Turn on the lights and turn off the lights"
+ A: ["Turn on the lights", "Turn off the lights"]
+
+ Q: "Switch off the lights and then play some music"
+ A: ["Switch off the lights", "Play some music"]
+
+ Q: "I am feeling sad today, play some music"
+ A: ["Play some cheerful music"]
+
+ Q: "{command_text}"
+ A:
+```
+
+This prompt gives the model a few examples to increase the generation accuracy, along with instructing it to return a Python list.
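
If you want to tweak the examples without editing one long string, the same few-shot layout can be generated from data. This is only a sketch: `INSTRUCTIONS`, `EXAMPLES`, and `build_prompt` are hypothetical names, and the output matches the prompt above only roughly (spacing differs slightly). `json.dumps` is used because, for plain ASCII strings like these, it prints a list in the same `["...", "..."]` form the prompt asks for.

```python
import json

# Instruction text and examples copied from the prompt above.
INSTRUCTIONS = (
    "You are provided with multiple commands as a single command. "
    "Break down all the commands and return them in a list of strings. "
    "If you are provided with a single command, return a list with a single "
    "string, trying your best to understand the command."
)

EXAMPLES = [
    ("Turn on the lights and turn off the lights",
     ["Turn on the lights", "Turn off the lights"]),
    ("Switch off the lights and then play some music",
     ["Switch off the lights", "Play some music"]),
    ("I am feeling sad today, play some music",
     ["Play some cheerful music"]),
]


def build_prompt(command_text: str) -> str:
    """Assemble the few-shot prompt, ending with an open 'A:' for the model."""
    blocks = [INSTRUCTIONS, "Example:"]
    for question, answer in EXAMPLES:
        blocks.append(f'Q: "{question}"\nA: {json.dumps(answer)}')
    blocks.append(f'Q: "{command_text}"\nA:')
    return "\n\n".join(blocks)


print(build_prompt("Dim the lights and play some jazz"))
```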
+
+
+#### Ze Code
+
+```python
+import openai
+
+from os import getenv
+
+openai.api_key = getenv("OPENAI_KEY")
+engine = "text-davinci-003"
+
+def execute_with_llm(command_text: str) -> None:
+    llm_prompt = f"""You are provided with multiple commands as a single command. Break down all the commands and return them in a list of strings. If you are provided with a single command, return a list with a single string, trying your best to understand the command.
+
+    Example:
+    Q: "Turn on the lights and turn off the lights"
+    A: ["Turn on the lights", "Turn off the lights"]
+
+    Q: "Switch off the lights and then play some music"
+    A: ["Switch off the lights", "Play some music"]
+
+    Q: "I am feeling sad today, play some music"
+    A: ["Play some cheerful music"]
+
+    Q: "{command_text}"
+    A:
+    """
+
+    completion = openai.Completion.create(engine=engine, prompt=llm_prompt, max_tokens=len(command_text.split(" "))*2)
+
+    for task in eval(completion.choices[0].text): # NEVER EVAL IN PROD RIGHT LIKE THIS
+        execute_command(task)
+```
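
The `eval` call in the function above runs whatever text the model returns as Python code. A safer alternative, sketched here under the assumption that the model really does answer with a Python-style list of strings, is `ast.literal_eval`, which only accepts literals and refuses function calls or imports. `parse_command_list` is a hypothetical helper name, not part of the original script.

```python
import ast


def parse_command_list(raw_completion: str) -> list[str]:
    """Parse the model's text into a list of command strings without executing it."""
    try:
        # literal_eval only understands literals (lists, strings, numbers, ...).
        parsed = ast.literal_eval(raw_completion.strip())
    except (ValueError, SyntaxError) as error:
        raise ValueError(
            f"Completion is not a Python literal: {raw_completion!r}"
        ) from error

    # Reject anything that is not exactly a list of strings.
    if not isinstance(parsed, list) or not all(isinstance(item, str) for item in parsed):
        raise ValueError(f"Expected a list of strings, got: {parsed!r}")
    return parsed


print(parse_command_list(' ["Switch off the lights", "Play some music"]\n'))
```

With this in place, the loop could read `for task in parse_command_list(completion.choices[0].text):`, and a malformed or truncated completion surfaces as a `ValueError` instead of running arbitrary code.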
+
+
+### Gluing together code
+
+To finish it all off, we can use argparse to only send the input command to OpenAI when asked to do so.
+
+```python
+import argparse
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("command", nargs="?", type=str, help="The command to pass to Siri", default="What time is it?")
+    parser.add_argument('--openai', action=argparse.BooleanOptionalAction, help="Use OpenAI to detect multiple intents", default=False)
+    args = parser.parse_args()
+
+    if args.openai:
+        execute_with_llm(args.command)
+    else:
+        execute_command(args.command)
+```
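
To check how these arguments behave without touching Siri, the parser can be fed an explicit list instead of reading `sys.argv`. In this small sketch `build_parser` is a hypothetical wrapper around the same two `add_argument` calls. Note that `argparse.BooleanOptionalAction` needs Python 3.9 or newer and also generates a matching `--no-openai` flag.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the same parser as the script, wrapped in a function for reuse."""
    parser = argparse.ArgumentParser()
    parser.add_argument("command", nargs="?", type=str,
                        help="The command to pass to Siri",
                        default="What time is it?")
    parser.add_argument("--openai", action=argparse.BooleanOptionalAction,
                        help="Use OpenAI to detect multiple intents",
                        default=False)
    return parser


# Passing a list to parse_args() simulates the command line.
args = build_parser().parse_args(
    ["turn off the lights and play some music", "--openai"]
)
print(args.command, args.openai)
```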
+
+## Conclusion
+
+Siri is still dumb. When I ask it to `Switch off the lights`, it defaults to the home thousands of miles away. But this code snippet definitely does work!
\ No newline at end of file