Invoking macOS speech-to-text

What would be the appropriate way to use the speech-to-text functionality of macOS to let users dictate the contents of a text field?
Pressing a JButton should trigger this: start the voice input, capture the recognized text, and stream it into the text field.

The only way that comes to mind is to ask users to enable the Voice Control feature in the Accessibility section of macOS System Settings. When it’s turned on, one can dictate text into any text field that has the keyboard focus. Unfortunately, there is a known issue with it (IJPL-175022), so it might be unstable at the moment.
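
Since Voice Control types into whichever component has the keyboard focus, the button can at least hand focus to the field and the application can watch the field's document for incoming text. Below is a minimal Swing sketch of that idea. It assumes the user has already enabled Voice Control and does not try to start it. The class name `DictationFieldSketch`, the helper `buildPanel`, and the `onText` callback are made up for illustration.

```java
import javax.swing.*;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import java.awt.FlowLayout;
import java.util.function.Consumer;

public class DictationFieldSketch {

    // Hypothetical helper: builds a panel with the field and a "Dictate" button.
    // onText receives the field's full text after every insertion or removal.
    static JPanel buildPanel(JTextField field, Consumer<String> onText) {
        JButton dictateButton = new JButton("Dictate");

        // Assumption: Voice Control is already enabled by the user. The button
        // only moves keyboard focus to the field so dictated text lands there.
        dictateButton.addActionListener(e -> field.requestFocusInWindow());

        // Observe the text as it arrives, whether typed or dictated.
        field.getDocument().addDocumentListener(new DocumentListener() {
            @Override public void insertUpdate(DocumentEvent e) { onText.accept(field.getText()); }
            @Override public void removeUpdate(DocumentEvent e) { onText.accept(field.getText()); }
            @Override public void changedUpdate(DocumentEvent e) { /* attribute changes only */ }
        });

        JPanel panel = new JPanel(new FlowLayout());
        panel.add(field);
        panel.add(dictateButton);
        return panel;
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Dictation sketch");
            frame.setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
            frame.add(buildPanel(new JTextField(30),
                    text -> System.out.println("Field now contains: " + text)));
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```

The listener cannot tell dictated text from typed text, so this only covers the "capture and stream into the field" part of the question. Starting the voice input still depends on the user having Voice Control turned on.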