Building a flow: video transcription and summarization

Life is too short for watching long videos? Let's build a workflow to transcribe and summarize any Youtube video.

  1. Go to the Flow and click the "Create" button.

  2. It's a good practice to store all your input data in a single node. Start by typing user input in the search bar of the left panel. Locate User-Defined Input Form and drag it to the workspace.

  3. This node has no pre-configured fields, so we should set them ourselves. Dive into the node’s settings and select Configure Fields.

  4. Add a field to store the link to the video. Let's name this field video_url. We don't have to change the field type or anything else. Click Save.

  5. Paste the Youtube link to the node, hit the Execute button and then click Save and execute to run this node and generate the output. We need it to move further. A video that is used as an example in this flow:

    Here's the output

  6. Start typing in the search bar what you want to do next: video transcription. We've found two models, let's give one of them a try. Drag whisperx-video-transcribe to your workspace

  7. Сonnect the video_url output of your User-Defined Input Form node to the Url input of your whisperx-video-transcribe node.

  8. Execute the whisperx-video-transcribe node. It will take a while to transcribe a video; the longer the video is, the more time is needed.

  9. We have a transcription, let's summarize it. Start typing chatgpt processor in the left panel and drag a ChatGPT Processor node to your workspace.

  10. Go to settings of the ChatGPT Processor node. You can choose ChatGPT versions with the Model dropdown. Let's choose gpt-4 because the transcript could be too long for older versions

  11. Now it's time to prepare ChatGPT for receiving instructions. In the node settings, go to Messages section, click on a pencil and a new message

    Change the type of the message to System as recommended by OpenAI to emphasize that it's a high-level instruction and to guide the model's behaviour.

  12. In the Message field write what you need ChatGPT to do: something like summarize a following video transcript

  13. Add another message (there's no need to change the message type this time). Here's a tricky part: due to the syntax of the ChatGPT Processor node we will need to use Expression editor. Click on the # symbol in the lower Message field.

    You'll get the Expression editor. There is a list of the nodes on the left. Click on the whisperx-video-transcribe to see its output.

    Drag the success output to the expression field

    The result should look like this

    Click the Save button in the Expression editor and don't forget to click top-right Save as well before leaving the settings.

  14. Connect the output of the whisperx-video-transcribe to the input of the ChatGPT Processor node and run the latter by clicking the Execute button on the node. An important note: don't click the Start button on the top panel yet! If you do this, all nodes will start over, and we don't want this because video transcription is a quite time-consuming task.

  15. When the ChatGPT finishes its work, you'll see that the text in this node is way shorter than the initial transcript. You can read it right away or copy it — just hover over the text to see the respective icon.

    Of course, you can copy the full transcript as well.

  16. Like storing all input data in a single node, collecting all output data in a single node is a good practice as well. For example, it will simplify matters greatly if you are going to launch the workflow via API in the future. Let's grab another User-Defined Input Form and add a couple of fields. Name them, for example, full_transcript and summary. Refer to paragraphs 2, 3 and 4 of this guide if needed.

  17. Connect the output of the whisperx-video-transcribe node to the Full_transcript input and the Success output of the ChatGPT Processor to the Summary input respectively. Run the final node by clicking the Execute button

  18. Now both a full transcript and a summary are stored in one node.

And finally: once you have the workflow built, there's no need to run nodes one by one anymore. To transcribe and summarize another video, simply change the link in the Video_url field of the User-Defined Input Form and use the Start button on the top bar this time.

Last updated