5 minute challenge: summarize audio
Last updated
Last updated
This guide will show you how to swiftly build from scratch a workflow that transcribes and summarizes audio recordings
Go to https://app.scade.pro/flow/ and click the New flow button and then choose Start with blank
There will already be Start end End nodes on the canvas. Place everything else between them. Go to the Settings of the Start Node Form and click Configure fields to start adding your data
Add one field, call it, for example, audio
, and change the input type to String/URI
to make it suitable for storing files. Click Save
You can upload the audio file from your computer or, alternatively, use the Set url button if you have a link to the file. We'll use Carl Sagan's "The pale blue dot" speech as an example file. Here's the link: https://tile.loc.gov/storage-services/media/ls/sagan/1958124-3-1.mp3
Hit the Execute button and then click Save and execute to run this node and generate the output. We need it to move further.
Use the search panel on the left to find a processor that transcribes audio. If you already know the name of your desired processor, type it. Alternatively, you can type what you want to do — like audio to text
there and pick by the name and description.
whisperx-a40-large looks promising, drag it to the canvas
Connect the audio
output of the Start node with the input of the whisperx node and click the Execute button on the latter. It will take a while to transcribe an audio file
Drag the ChatGPT processor to your canvas, connect the output of the whisperx with the ChatGPT's input and go to the settings of the latter. Review the instructions in paragraphs 6-8 if needed. You can move your nodes around whatever you like to create a convinient workspace
Locate the Messages section in the settings and click the pencil to add instructions for the ChatGPT
Click Add message.
Change the type of the message to System
as recommended by OpenAI to emphasize that it's a high-level instruction and to guide the model's behaviour
The prompt here is rather straightforward: summarize a following audio transcript
Add another message and click the # symbol. The Expression editor will open. In case of the ChatGPT node we need it to get data from other nodes
There is a list of the nodes on the left. Click on the whisprex-a40-large to see its output data. Keep going until you end up with segments
— that's where the transcription is stored.
Drag the segments
to the expression field
The result should look similar to this. Don't forget to click the Save buttons in the Expression editor and before closing the node settings.
Click the Execute button on the ChatGPT node after you set everything
In principle, you are good to go now: the ChatGPT node generates summary, and you can copy and use it. But let's do some housekeeping to make sure that this workflow will be reusable and suitable to API integrations. Open settings of the End Node Form
Add a field and name it, there's no need to change the field type now
Connect the success
output of the ChatGPT node with the newly generated input of the End node and execute the End node
Now you have your result in the End node. Why it's important? We'll cover this later If you hover the text you'll see the button to copy and take it elsewhere
Once you have the workflow built, there's no need to run nodes one by one anymore. Simply change the file in the Start node and use the Play button on the top panel to launch the entire workflow.
And one last thing. It's a good idea to give your workflow a comprehensive name. Locate the cog in the top bar, fill in the suitable Flow name and click Submit