5-minute challenge: summarize audio
This guide shows you how to quickly build, from scratch, a workflow that transcribes and summarizes audio recordings.
Go to https://app.scade.pro/flow/, click the New flow button, and then choose Start with blank.


There will already be Start and End nodes on the canvas. Place everything else between them. Go to the Settings of the Start Node Form and click Configure fields to start adding your data.


Add one field and name it, for example, audio. Change the input type to String/URI to make it suitable for storing files. Click Save.

You can upload the audio file from your computer or, alternatively, use the Set url button if you have a link to the file. We'll use Carl Sagan's "Pale Blue Dot" speech as an example file. Here's the link: https://tile.loc.gov/storage-services/media/ls/sagan/1958124-3-1.mp3
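Before pasting a link, you may want to confirm that it actually serves an audio file. The snippet below is an optional sanity check run outside of Scade, a minimal sketch using Python's requests library; the expected Content-Type value is an assumption about how the server labels the file.

```python
# Optional sanity check, run outside of Scade: confirm the link serves an audio file.
import requests

url = "https://tile.loc.gov/storage-services/media/ls/sagan/1958124-3-1.mp3"
response = requests.head(url, allow_redirects=True)

print(response.status_code)                  # expect 200
print(response.headers.get("Content-Type"))  # expect something like audio/mpeg (assumption)
```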

Hit the Execute button and then click Save and execute to run this node and generate its output. We'll need that output to move on.


Use the search panel on the left to find a processor that transcribes audio. If you already know the name of the processor you want, type it. Alternatively, you can type what you want to do, such as audio to text, and pick a processor by its name and description. whisperx-a40-large looks promising; drag it to the canvas.

Connect the audio output of the Start node to the input of the whisperx node and click the Execute button on the latter. It will take a while to transcribe the audio file.

Drag the ChatGPT processor to your canvas, connect the output of the whisperx node to the ChatGPT node's input, and go to the settings of the latter. Review the instructions in paragraphs 6-8 if needed. You can move your nodes around however you like to create a convenient workspace.

Locate the Messages section in the settings and click the pencil to add instructions for ChatGPT.

Click Add message. Change the message type to System, as recommended by OpenAI, to emphasize that it's a high-level instruction that guides the model's behavior.

The prompt here is rather straightforward: summarize the following audio transcript.
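Conceptually, the messages you are assembling here follow OpenAI's chat format: a system message carries the high-level instruction and a user message carries the transcript. The sketch below only illustrates that structure in Python; the exact payload the Scade node builds is an assumption.

```python
# Illustration of the OpenAI-style message structure the ChatGPT node assembles.
# The placeholder stands in for the transcript pulled from whisperx in the next step.
messages = [
    {"role": "system", "content": "Summarize the following audio transcript."},
    {"role": "user", "content": "<segments from the whisperx node go here>"},
]
```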

Add another message and click the # symbol. The Expression editor will open. In the case of the ChatGPT node, we need it to pull data from other nodes.

There is a list of nodes on the left. Click whisperx-a40-large to see its output data. Keep expanding until you reach segments; that's where the transcription is stored. Drag segments to the expression field.
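To give a sense of what you are dragging in: Whisper-family processors typically return the transcription as a list of timed segments. The field names and values below are illustrative assumptions about the whisperx-a40-large output, not its exact schema.

```python
# Assumed shape of the whisperx "segments" output: a list of timed chunks of text.
segments = [
    {"start": 0.0, "end": 6.5, "text": "We succeeded in taking that picture,"},
    {"start": 6.5, "end": 12.8, "text": "and if you look at it, you see a dot."},
]

# Joining the text fields yields the full transcript that ChatGPT will summarize.
transcript = " ".join(segment["text"] for segment in segments)
```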

The result should look similar to this. Don't forget to click Save in the Expression editor and again before closing the node settings.

Click the Execute button on the ChatGPT node once everything is set up.
In principle, you are good to go now: the ChatGPT node generates a summary, and you can copy and use it. But let's do some housekeeping to make sure this workflow is reusable and suitable for API integrations. Open the settings of the End Node Form.

Add a field and name it; there's no need to change the field type this time.

Connect the success output of the ChatGPT node to the newly created input of the End node and execute the End node.

Now you have your result in the End node. Why is this important? We'll cover that later. If you hover over the text, you'll see a button to copy it and take it elsewhere.

Once you have the workflow built, there's no need to run nodes one by one anymore. Simply change the file in the Start node and use the Play button on the top panel to launch the entire workflow.

And one last thing: it's a good idea to give your workflow a descriptive name. Locate the cog in the top bar, fill in a suitable Flow name, and click Submit.
