toc2audio

What is this?

This tool parses a Table of Contents file and:

  • Optionally, shows the generated HTML in your browser and also prints it on the console, for copy & paste or redirection to complete your show notes.

  • TODO: Optionally, adds the TOC generated HTML to an audio file.

  • Optionally, adds timestamps (chapters) from your TOC file to an audio file.

  • Optionally, adds a time offset to all timestamps (in the HTML and in the chapters) to compensate for an initial presentation, teasers, advertisements inserted in the audio, etc.

If the audio file already has chapter/TOC metadata, it will be replaced as requested. Any other metadata present in the original file will be preserved.

Install

$ python3 -m pip install toc2audio

Command line

$ toc2audio --version
0.5.0

$ toc2audio -h
usage: toc2audio.py [-h] [--version] [--offset OFFSET] [--show]
                    [--toc] [--chapters]
                    [TOC] [AUDIO ...]

Add Table of Contents and chapters to audio files

positional arguments:
  TOC              Table of Contents file
  AUDIO            Audio file

optional arguments:
  -h, --help       show this help message and exit
  --version        Show version info
  --offset OFFSET  Seconds or [HH:]MM:SS to add to ALL timestamps
  --show           Show the generated HTML in your browser
  --toc            Store Table of Contents in the audio file
  --chapters       Store chapters details in the audio file

Table of Contents format

The Table of Contents must be written in markdown.

toc2audio will parse any markdown file and will, optionally, insert the generated HTML and the chapter metadata into your audio file. If you want to use timestamps (chapters), you must use lines in this format:

[HH:MM:SS] Chapter title

HH:MM:SS is hours:minutes:seconds. The “hours” field is optional. You can specify fields with one or two digits.

An example would be:

This audio was recorded blah blah...

* [00:50] Presentation

    Here I describe the topics we will talk about.

* [02:11] Topic 1

    Blah blah blah blah...

* [17:29] Topic 2

    Blah blah blah blah...
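
As an illustration of the timestamp format described above, here is a minimal Python sketch that extracts (seconds, title) pairs from TOC lines. It is not the code toc2audio actually uses: the TIMESTAMP pattern and the parse_chapter helper are hypothetical names, and the sketch only assumes what the format description states (optional hours, one or two digits per field).

```python
import re

# Hypothetical pattern: "[MM:SS]" or "[HH:MM:SS]", one or two digits per field.
TIMESTAMP = re.compile(r"\[(?:(\d{1,2}):)?(\d{1,2}):(\d{1,2})\]\s*(.*)")


def parse_chapter(line):
    """Return (start_seconds, title) for a chapter line, or None if it has no timestamp."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    hours, minutes, seconds, title = match.groups()
    # The hours field is optional, so treat a missing value as zero.
    total = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return total, title.strip()


toc = """\
This audio was recorded blah blah...

* [00:50] Presentation

    Here I describe the topics we will talk about.

* [02:11] Topic 1
* [1:02:03] Topic 2
"""

# Keep only the lines that carry a timestamp.
chapters = [c for c in map(parse_chapter, toc.splitlines()) if c is not None]
print(chapters)
```

Lines without a timestamp, such as the introduction and the indented paragraphs, are simply skipped by this sketch.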

Note

Notice that when list items have multiple paragraphs, each subsequent paragraph MUST BE indented by either FOUR spaces or a tab, as documented in the Markdown Syntax Documentation and in the documentation of the Python-Markdown project.

Warning

If the first chapter doesn’t start at “[00:00]” for whatever reason (including specifying a Time offset), a dummy chapter will be implicitly added covering from “[00:00]” to the first chapter.
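
The rule in this warning can be sketched in a few lines of Python. This is only an illustration under assumptions: chapters are represented as (start_seconds, title) tuples sorted by start time, with_leading_chapter is a hypothetical helper, and the title given to the dummy chapter is a placeholder, since the text above does not say which title is used.

```python
def with_leading_chapter(chapters, dummy_title="(start)"):
    """Prepend a chapter starting at 0 seconds when the first one starts later.

    chapters: list of (start_seconds, title) tuples, sorted by start time.
    dummy_title: placeholder title, an assumption made for this sketch.
    """
    if chapters and chapters[0][0] > 0:
        return [(0, dummy_title)] + list(chapters)
    return list(chapters)


chapters = [(50, "Presentation"), (131, "Topic 1")]
# The first chapter starts at 50 seconds, so a dummy chapter is added at 0.
print(with_leading_chapter(chapters))
```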

Time offset

You can apply a global time offset to all timestamps in the TOC markdown document using the --offset command line parameter. The offset is given in seconds or as [HH:]MM:SS.
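
To make the effect of the offset concrete, here is a small Python sketch. It assumes the offset is a whole number of seconds or an [HH:]MM:SS value, as the --offset help text says, and that chapters are (start_seconds, title) tuples. The parse_offset and apply_offset helpers are hypothetical names, not functions exported by toc2audio.

```python
def parse_offset(text):
    """Convert "SS", "MM:SS" or "HH:MM:SS" into a number of seconds."""
    fields = [int(field) for field in text.split(":")]
    if not 1 <= len(fields) <= 3:
        raise ValueError(f"invalid offset: {text!r}")
    seconds = 0
    for field in fields:
        # Each field to the left is worth 60 times the one to its right.
        seconds = seconds * 60 + field
    return seconds


def apply_offset(chapters, offset):
    """Shift every chapter start by the same number of seconds."""
    return [(start + offset, title) for start, title in chapters]


chapters = [(50, "Presentation"), (131, "Topic 1")]
print(apply_offset(chapters, parse_offset("1:30")))
```

With a positive offset no chapter starts at zero any more, which is the situation described in the warning of the previous section.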

Supported audio containers

Supported audio containers are:

  • Opus. If you can choose an audio format freely, you should choose Opus. It is the current (2021) state of the art for general purpose audio (voice and music) and it is free of patents. It is “the greatest thing since sliced bread”.

  • Vorbis.

  • MP3.

    Warning

    In many MP3 players, the MP3 file MUST BE CBR (constant bitrate) for seeking to chapters to be accurate.

  • M4A (MP4 audio).

    Usually, MP4 audiobooks have an m4b extension to advertise the presence of bookmarks. Nevertheless, the file is bitwise identical to an m4a file. Some software doesn’t recognize m4b files, so I use an m4a suffix.

    Usually, the audio format will be AAC, HE-AAC or HE-AACv2, but I don’t really care. I manipulate the generic MP4 container and don’t pay attention to the audio data. I guess I could even add chapters to video data.

    Warning

    This feature requires FFmpeg to be available.
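
Before processing M4A files, you may want to check that FFmpeg can be found. The following Python sketch looks for an executable on the PATH using the standard library. It assumes the executable is called ffmpeg, and ffmpeg_available is a hypothetical helper: how toc2audio itself locates FFmpeg is not described here.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the given executable can be found on the PATH."""
    return shutil.which(executable) is not None


if not ffmpeg_available():
    print("FFmpeg was not found on the PATH; M4A support needs it.")
```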

Author and License

The author of this package is Jesús Cea Avión.

This code is licensed under the GNU Affero General Public License v3 (AGPLv3).