A quick introduction to Talon, a hands-free input system that lets you control your computer with your voice and Python.
By Gaute Andreas Berge · December 22, 2021
When I started my master’s degree, I had just developed a repetitive strain injury (RSI) that made it painful, at times impossible, to use a keyboard, mouse or even a phone. Still, I was able to finish my degree without any delay, and complete a summer internship at Bekk. I was able to do this because I had discovered Talon, a hands-free input system that allowed me to use the computer proficiently with only a microphone. These days the pain is not as bad as it used to be and I’m no longer fully reliant on Talon, but when I’m primarily using a keyboard I find myself missing some of the awesome features Talon has to offer. I believe that systems like Talon can be really effective even for people without disabilities, so in this article I will demonstrate how you can use Talon to improve your workflow and significantly increase your wow-factor when pair programming on Zoom 😉
Using Talon
If you want to follow along with the examples, you can install Talon from the official website. Talon does not come with any built-in commands. If you want to be able to control your computer fully using only Talon, I recommend downloading a larger command set such as knausj_talon, but if you’re just playing around you can simply create a file with the .talon extension in the user directory (%APPDATA%\Talon\user on Windows, and ~/.talon/user on macOS/Linux).
Let’s get started with the simplest example:
say hello: "Hello from Talon!"
Uttering the phrase say hello will now make Talon type Hello from Talon!
You can also emulate any keystroke, for example to switch to the previously focused window:
switch: key(cmd-tab) # or alt-tab on Linux/Windows
This is not really faster than pressing the key combination yourself, but voice commands are much easier to remember than keyboard shortcuts, and it’s really easy to chain them together to make macros.
Voice commands can also capture parts of a spoken phrase to do interesting things with them.
call <word>:
    "{word}()"
    key(left)
The above lets me say, for example, call print to produce print() with the cursor inside the parentheses. Now we have what we need to create some useful snippets!
for <word> in <word>: "for {word_1} in {word_2}: "
add to do <phrase>: "// TODO: {phrase}"
You can of course be really productive with just the snippets in your editor, but voice snippets let you leverage the naturally high throughput of speech, and they have the benefit of being available everywhere (Notepad, Slack, browsers, etc.).
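The way numbered captures fill in a snippet can be mimicked in plain Python. This is only a sketch of the idea, not how Talon is implemented: the helper `fill_snippet` is hypothetical, and it assumes the captured words arrive as a list in spoken order.

```python
from typing import Sequence


def fill_snippet(template: str, words: Sequence[str]) -> str:
    """Fill {word_1}, {word_2}, ... placeholders with the captured words in order."""
    # Build the names word_1, word_2, ... for the first, second, ... spoken word.
    values = {f"word_{index}": word for index, word in enumerate(words, start=1)}
    return template.format(**values)


if __name__ == "__main__":
    # Saying "for item in items" would capture the words "item" and "items".
    print(fill_snippet("for {word_1} in {word_2}: ", ["item", "items"]))
```

The numbering matters whenever the same capture appears more than once in a command, since each occurrence needs its own name in the output.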
If this syntax reminds you of Python’s f-strings, then that’s great, because that is what they are! That means we can write Python expressions and have them evaluated. We could, for example, create a spoken calculator with the number capture:
multiply <number> by <number>: "{number_1 * number_2}"
Now saying multiply two by three will produce 6 🤯
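To make the f-string comparison concrete, here is the same expression in ordinary Python. The function name `multiply_spoken` is made up for illustration, and the sketch assumes the number capture has already turned the spoken words into integers.

```python
def multiply_spoken(number_1: int, number_2: int) -> str:
    """Mimic the body of the voice command with an ordinary f-string."""
    # The expression inside the braces is evaluated first, then formatted as text.
    return f"{number_1 * number_2}"


if __name__ == "__main__":
    # "multiply two by three" would give number_1 = 2 and number_2 = 3.
    print(multiply_spoken(2, 3))
```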
Talon can also respond to keyboard events, so the macros you write for Talon can still be used when you’re sitting in an office and don’t want to use a microphone. For example:
key(alt-t): speech.toggle() # toggle Talon with alt + t
Getting more advanced
You can do a lot with just the built-in scripting language, but things get really interesting when you start adding Python modules. If you’re still following along, go ahead and create a Python file next to the .talon file you created earlier.
import os
from talon import Context, Module, actions, clip
module = Module()
context = Context()
I haven’t explained what modules or contexts are, but for now it’s sufficient to know that they control when groups of actions and voice commands are active. This means you can’t accidentally trigger application-specific commands when that application is not open, and you can define actions that work differently based on the context you’re in, such as which programming language you’re using.
The following example shows how you can use Talon’s clipboard API to grab the currently selected text and append it to a file called notes.md in your home directory.
@module.action_class
class Actions:
    def note_selected_text():
        "save the selected text as a bullet point in a file"
        # copy the selection; the previous clipboard contents are restored afterwards
        with clip.capture() as clipboard:
            actions.key("cmd-c")
        try:
            text = clipboard.get()
        except clip.NoChange:
            # nothing was selected, so there is nothing to save
            return
        message = f"- {text}\n"
        # mode "a" appends and creates the file if it does not exist
        with open(os.path.expanduser("~/notes.md"), "a") as file:
            file.write(message)
clip.capture() allows us to work with the clipboard while making sure it gets restored once we are done with it. And just like that, we have a simple note-taking script that can be triggered in any application that supports copy and paste. The action can be invoked from a voice command like so:
make note of that: user.note_selected_text()
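The file-writing half of the action can be tried on its own, without Talon. The sketch below is a standalone approximation: the helper `append_note` and its `path` parameter are hypothetical names added here so the function can be pointed at a temporary file.

```python
import os
import tempfile


def append_note(text: str, path: str = "~/notes.md") -> str:
    """Append `text` as a bullet point to the notes file and return the line written."""
    message = f"- {text}\n"
    # Mode "a" appends to the file and creates it if it does not exist yet.
    with open(os.path.expanduser(path), "a", encoding="utf-8") as file:
        file.write(message)
    return message


if __name__ == "__main__":
    # Write to a temporary directory so the demo leaves ~/notes.md untouched.
    with tempfile.TemporaryDirectory() as folder:
        notes = os.path.join(folder, "notes.md")
        append_note("first thought", notes)
        append_note("second thought", notes)
        with open(notes, encoding="utf-8") as file:
            print(file.read(), end="")
```

Keeping the file handling in a plain function like this also makes it easy to test separately from the clipboard logic.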
Going back to a programming example, let’s revisit our command for calling Python functions, which has a couple of issues. Not all function names are easy to pronounce, so to address this we can define a Talon list which associates a spoken form with some output:
module.list("python_functions", desc="python functions")
context.lists["user.python_functions"] = {
    "print": "print",
    "join": ".join",
    "compare files": "cmpfiles",
}
We can then reference this list in a voice command as {user.python_functions}. Another issue is that we used the word capture, which only captures a single word, but we want to be able to dictate longer function names and format them in snake case. One way to do this is to define a custom capture, which we can use to parse user input and process it however we like:
@module.capture(rule="({user.python_functions}|<phrase>)")
def python_function_name(m) -> str:
    # get the first capture
    text = m[0]
    # make every word lower case, and join them by "_"
    return "_".join(word.lower() for word in text.split())
Rules are defined like regular expressions, except that they work on whole words instead of characters, so hello* matches hello hello hello, and not hellooo. The above capture will match any phrase, but if the phrase matches an entry in the python_functions list, the list’s output is used instead. We can then update our voice command to be:
call <user.python_function_name>:
    "{python_function_name}()"
    key(left)
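The intended behaviour of the list and the custom capture can be approximated in plain Python, which is handy for checking the formatting logic without a microphone. This is a rough sketch rather than Talon's actual matching: `PYTHON_FUNCTIONS` is a stand-in for the Talon list, and `call_snippet` is a hypothetical helper that returns the text the command would type.

```python
# Stand-in for context.lists["user.python_functions"]: spoken form -> output.
PYTHON_FUNCTIONS = {
    "print": "print",
    "join": ".join",
    "compare files": "cmpfiles",
}


def python_function_name(spoken: str) -> str:
    """Return the list entry for a known spoken form, else snake_case the phrase."""
    # Normalise case and whitespace so "Compare  Files" still finds its entry.
    key = " ".join(spoken.lower().split())
    if key in PYTHON_FUNCTIONS:
        return PYTHON_FUNCTIONS[key]
    # Fallback: join the lower-cased words with underscores.
    return "_".join(key.split())


def call_snippet(spoken: str) -> str:
    """Text the call command would type, before the cursor is moved left."""
    return f"{python_function_name(spoken)}()"


if __name__ == "__main__":
    for phrase in ("print", "compare files", "read all lines"):
        print(phrase, "->", call_snippet(phrase))
```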
There’s a lot more fun to be had
The things I presented in this article only scratch the surface of what you can do with Talon. Talon also has (among other things):
file watching
ready-made commands in knausj_talon, such as google that, which will google the selected text, or launch/focus, which will open or focus any application you say
If any of this excites you, feel free to check out the official website and knausj_talon, and join the Slack channel (link on the website).
I can also recommend Emily Shea’s talk Voice Driven Development: Who needs a keyboard anyway?
Conclusion
Talon has a lot of features that I think can be really useful for anyone who likes optimising parts of their workflow. Even if you don’t want to use voice commands in your normal workflow, you can use it as a general-purpose scripting environment and create macros that will work anywhere. If you need any help getting started, feel free to contact me or anyone else in the Slack 😊