new post

author: navanchauhan <navanchauhan@gmail.com> 2023-02-08 17:40:27 -0700
committer: navanchauhan <navanchauhan@gmail.com> 2023-02-08 17:40:27 -0700
commit: e3c2fac4f49859268d2f337ecaa64c41e3a6bd1d (patch)
tree: d336786cbcd3f8d39f4914b734741378126d22df /Content/posts
parent: 0d34e3633ba36edd049ce3ba52908e061b7273d5 (diff)
1 files changed, 229 insertions, 0 deletions
diff --git a/Content/posts/2023-02-08-Interact-with-siri-from-the-terminal.md b/Content/posts/2023-02-08-Interact-with-siri-from-the-terminal.md
new file mode 100644
index 0000000..78de77c
--- /dev/null
+++ b/Content/posts/2023-02-08-Interact-with-siri-from-the-terminal.md
@@ -0,0 +1,229 @@
+---
+date: 2023-02-08 17:21
+description: Code snippet to interact with Siri by issuing commands from the command-line.
+tags: Tutorial, Code-Snippet, Python, Siri, macOS, AppleScript
+---
+
+# Interacting with Siri using the command line
+
+My main objective was to see if I could issue multi-intent commands in one go. Obviously, Siri cannot do that (neither can Alexa, Cortana, or Google Assistant). The script here can issue either a single command, or use the help of OpenAI's DaVinci model to extract multiple commands and pass them onto siri.
+
+## Prerequisites
+
+* Run macOS
+* Enable type to Siri (Settings > Accessibility -> Type to Siri)
+* Enable the Terminal to control System Events (The first time you run the script, it will prompt you to enable it)
+
+## Show me ze code
+
+If you are here just for the code:
+
+```python
+import argparse
+import applescript
+import openai
+
+from os import getenv
+
+openai.api_key = getenv("OPENAI_KEY")
+engine = "text-davinci-003"
+
+def execute_with_llm(command_text: str) -> None:
+	llm_prompt = f"""You are provided with multiple commands as a single command. Break down all the commands and return them in a list of strings. If you are provided with a single command, return a list with a single string, trying your best to understand the command.
+	
+	Example:
+	Q: "Turn on the lights and turn off the lights"
+	A: ["Turn on the lights", "Turn off the lights"]
+
+	Q: "Switch off the lights and then play some music"
+	A: ["Switch off the lights", "Play some music"]
+
+	Q: "I am feeling sad today, play some music"
+	A: ["Play some cheerful music"]
+
+	Q: "{command_text}"
+	A: 
+	"""
+
+	completion = openai.Completion.create(engine=engine, prompt=llm_prompt, max_tokens=len(command_text.split(" "))*2)
+
+	for task in eval(completion.choices[0].text):
+		execute_command(task)
+
+
+def execute_command(command_text: str) -> None:
+	"""Execute a Siri command."""
+
+	script = applescript.AppleScript(f"""
+		tell application "System Events" to tell the front menu bar of process "SystemUIServer"
+			tell (first menu bar item whose description is "Siri")
+				perform action "AXPress"
+			end tell
+		end tell
+
+		delay 2
+
+		tell application "System Events"
+			set textToType to "{command_text}"
+			keystroke textToType
+			key code 36
+		end tell
+	""")
+
+	script.run()
+
+
+if __name__ == "__main__":
+	parser = argparse.ArgumentParser()
+	parser.add_argument("command", nargs="?", type=str, help="The command to pass to Siri", default="What time is it?")
+	parser.add_argument('--openai', action=argparse.BooleanOptionalAction, help="Use OpenAI to detect multiple intents", default=False)
+	args = parser.parse_args()
+
+	if args.openai:
+		execute_with_llm(args.command)
+	else:
+		execute_command(args.command)
+```
+
+Usage:
+
+```bash
+python3 main.py "play some taylor swift"
+python3 main.py "turn off the lights and play some music" --openai
+```
+
+## ELI5 
+
+I am not actually going to explain it as if I am explaining to a five-year old kid.
+
+### AppleScript
+
+In the age of Siri Shortcuts, AppleScript can still do more. It is a scripting language created by Apple that can help you automate pretty much anything you see on your screen.
+
+We use the following AppleScript to trigger Siri and then type in our command:
+
+```applescript
+tell application "System Events" to tell the front menu bar of process "SystemUIServer"
+	tell (first menu bar item whose description is "Siri")
+		perform action "AXPress"
+	end tell
+end tell
+
+delay 2
+
+tell application "System Events"
+	set textToType to "Play some rock music"
+	keystroke textToType
+	key code 36
+end tell
+```
+
+This first triggers Siri, waits for a couple of seconds, and then types in our command. In the script, this functionality is handled by the `execute_command` function.
+
+```python
+import applescript
+
+def execute_command(command_text: str) -> None:
+	"""Execute a Siri command."""
+
+	script = applescript.AppleScript(f"""
+		tell application "System Events" to tell the front menu bar of process "SystemUIServer"
+			tell (first menu bar item whose description is "Siri")
+				perform action "AXPress"
+			end tell
+		end tell
+
+		delay 2
+
+		tell application "System Events"
+			set textToType to "{command_text}"
+			keystroke textToType
+			key code 36
+		end tell
+	""")
+
+	script.run()
+```
+
+### Multi-Intent Commands
+
+We can call OpenAI's API to autocomplete our prompt and extract multiple commands. We don't need to use OpenAI's API, and can also simply use Google's Flan-T5 model using HuggingFace's transformers library. 
+
+#### Ze Prompt
+
+```text
+You are provided with multiple commands as a single command. Break down all the commands and return them in a list of strings. If you are provided with a single command, return a list with a single string, trying your best to understand the command.
+	
+	Example:
+	Q: "Turn on the lights and turn off the lights"
+	A: ["Turn on the lights", "Turn off the lights"]
+
+	Q: "Switch off the lights and then play some music"
+	A: ["Switch off the lights", "Play some music"]
+
+	Q: "I am feeling sad today, play some music"
+	A: ["Play some cheerful music"]
+
+	Q: "{command_text}"
+	A:
+```
+
+This prompt gives the model a few examples to increase the generation accuracy, along with instructing it to return a Python list. 
+
+
+#### Ze Code
+
+```python
+import openai
+
+from os import getenv
+
+openai.api_key = getenv("OPENAI_KEY")
+engine = "text-davinci-003"
+
+def execute_with_llm(command_text: str) -> None:
+	llm_prompt = f"""You are provided with multiple commands as a single command. Break down all the commands and return them in a list of strings. If you are provided with a single command, return a list with a single string, trying your best to understand the command.
+	
+	Example:
+	Q: "Turn on the lights and turn off the lights"
+	A: ["Turn on the lights", "Turn off the lights"]
+
+	Q: "Switch off the lights and then play some music"
+	A: ["Switch off the lights", "Play some music"]
+
+	Q: "I am feeling sad today, play some music"
+	A: ["Play some cheerful music"]
+
+	Q: "{command_text}"
+	A: 
+	"""
+
+	completion = openai.Completion.create(engine=engine, prompt=llm_prompt, max_tokens=len(command_text.split(" "))*2)
+
+	for task in eval(completion.choices[0].text): # NEVER EVAL IN PROD RIGHT LIKE THIS
+		execute_command(task)
+```
+
+
+### Gluing together code
+
+To finish it all off, we can use argparse to only send the input command to OpenAI when asked to do so.
+
+```python
+import argparse
+
+if __name__ == "__main__":
+	parser = argparse.ArgumentParser()
+	parser.add_argument("command", nargs="?", type=str, help="The command to pass to Siri", default="What time is it?")
+	parser.add_argument('--openai', action=argparse.BooleanOptionalAction, help="Use OpenAI to detect multiple intents", default=False)
+	args = parser.parse_args()
+
+	if args.openai:
+		execute_with_llm(args.command)
+	else:
+		execute_command(args.command)
+```
+
+## Conclusion
+
+Siri is still dumb. When I ask it to `Switch off the lights`, it default to the home thousands of miles away. But, this code snippet definitely does work!
+\ No newline at end of file
author	navanchauhan <navanchauhan@gmail.com>	2023-02-08 17:40:27 -0700
committer	navanchauhan <navanchauhan@gmail.com>	2023-02-08 17:40:27 -0700
commit	e3c2fac4f49859268d2f337ecaa64c41e3a6bd1d (patch)
tree	d336786cbcd3f8d39f4914b734741378126d22df /Content/posts
parent	0d34e3633ba36edd049ce3ba52908e061b7273d5 (diff)