The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Published: 2025-12-23 11:26:07

Author: 1

https://www.tornevalls.se/the-struggle-transcribe-stuff-for-free-with-whisper-and-wsl-linux-with-a-gtx-1060/

I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need lyrics as text to paste into Suno, but they only exist as an m4a file (i.e. music, which sometimes comes with hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for the NVIDIA GTX 1060, a card that newer versions of the Python libraries no longer seem to handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

whisper.bat

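@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer; stop early if nothing was passed
if "%~1"=="" (
    echo Usage: %~nx0 ^<file^>
    exit /b 1
)
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
set "WSL_FILE="
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
if not defined WSL_FILE (
    echo Error: could not convert path: "%WIN_FILE%"
    exit /b 1
)

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal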
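To make the path conversion step less opaque, here is a rough sketch of what `wslpath` does for plain drive-letter paths. It assumes the default WSL automount root `/mnt`; `win_to_wsl` is a hypothetical helper for illustration only, and the batch file should keep calling the real `wslpath`, which also handles cases this sketch ignores (UNC paths, custom mount roots).

```bash
#!/usr/bin/env bash
# Sketch only: approximate wslpath for drive-letter paths such as F:\dir\file.
# Assumes drives are mounted under /mnt/<lowercase letter> (the WSL default).
win_to_wsl() {
    local win="$1" drive rest
    case "$win" in
        [A-Za-z]:\\*) ;;   # accept only "X:\..." style paths
        *) printf 'not a drive-letter path: %s\n' "$win" >&2; return 1 ;;
    esac
    # Drive letter is everything before the first colon, lowercased.
    drive="$(printf '%s' "${win%%:*}" | tr 'A-Z' 'a-z')"
    # Remainder after "X:", with backslashes turned into forward slashes.
    rest="$(printf '%s' "${win#?:}" | tr '\\' '/')"
    printf '/mnt/%s%s\n' "$drive" "$rest"
}

win_to_wsl 'F:\viktigt\Private\Linux-Scripts\Whisper.bat'
```

Spaces and characters such as å ä ö pass through untouched, because only the drive letter and the backslashes are rewritten.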

whisper.reg (Explorer right-click menu)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transcribe with Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
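The command value above hardcodes the location of the batch file on my machine, so it has to be adapted. In a .reg file every backslash in a string value is doubled and inner quotes are written as `\"`. The sketch below builds that line for an arbitrary path; `reg_command_value` is a hypothetical helper name, not something Windows or WSL provides.

```bash
#!/usr/bin/env bash
# Sketch only: print the default-value line for the ...\command key in whisper.reg.
reg_command_value() {
    local bat_path="$1" escaped
    # Double every backslash, as .reg string values require.
    escaped="$(printf '%s' "$bat_path" | sed 's/\\/\\\\/g')"
    # Wrap the script path and the %1 placeholder in escaped quotes.
    printf '@="\\"%s\\" \\"%%1\\""\n' "$escaped"
}

reg_command_value 'F:\viktigt\Private\Linux-Scripts\Whisper.bat'
```

Run with the path shown, this reproduces the line used in the registry file above; swap in the folder where Whisper.bat actually lives.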

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

The script has a -u switch that removes the conflicting packages (torch, torchvision, torchaudio and numpy) from the venv. If something goes wrong the first time, run the script with -u and then once more without it to reinstall without conflicts.

#!/usr/bin/env bash
set -euo pipefail

# Installs (or, with -u, removes) the NumPy/PyTorch versions chosen for the GTX 1060.
# Note: this script does not create the venv; it only manages the pinned packages in it.
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
    case "$opt" in
        u) MODE="uninstall" ;;
        *)
            echo "Usage: $0 [-u]"
            exit 1
            ;;
    esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
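Since the installer pins exact versions, a later `pip install` in the same venv can replace them without much notice. The following is a minimal sketch of a check that reads `pip freeze` output on stdin and compares it with the pins used above. `want_version` and `check_pins` are hypothetical helper names and not part of the installer; the version numbers are copied from the install command.

```bash
#!/usr/bin/env bash
# Sketch only: compare "name==version" lines (pip freeze format) with the pinned stack.

# Pinned version for a package name; non-zero exit for packages we do not track.
want_version() {
    case "$1" in
        numpy)       echo "1.26.4" ;;
        torch)       echo "1.13.1+cu116" ;;
        torchvision) echo "0.14.1+cu116" ;;
        torchaudio)  echo "0.13.1" ;;
        *)           return 1 ;;
    esac
}

check_pins() {
    local status=0 seen=" " line name version want
    while IFS= read -r line; do
        case "$line" in *==*) ;; *) continue ;; esac   # only name==version lines
        name="${line%%==*}"
        version="${line#*==}"
        want="$(want_version "$name")" || continue     # skip untracked packages
        seen="$seen$name "
        case "$version" in
            # Exact match, or the pinned version plus a local suffix such as +cu116.
            "$want"|"$want"+*) echo "ok       $name $version" ;;
            *) echo "MISMATCH $name $version (wanted $want)"; status=1 ;;
        esac
    done
    # Report pinned packages that never showed up in the input.
    for name in numpy torch torchvision torchaudio; do
        case "$seen" in
            *" $name "*) ;;
            *) echo "MISSING  $name"; status=1 ;;
        esac
    done
    return "$status"
}

# Usage, assuming the default venv location from the installer:
#   "$HOME/.venvs/whisper/bin/pip" freeze | check_pins
```

The function returns non-zero when any pinned package is missing or has drifted, so it can be used as a guard before starting a long transcription.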

The script itself

The script needs no switches, only the audio file to transcribe. As the usage comment shows, it also accepts an optional model name (default: small) and an optional language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
#   whisper <input.extension> [model] [language]
#
# Output:
#   <input-filename>.txt (same directory)
#
# Behaviour:
#   - Refuses to overwrite existing .txt
#   - Stops execution if output exists

if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]" >&2
    exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT" >&2
    exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:" >&2
    echo " $OUTPUT" >&2
    echo "Aborting to avoid overwrite." >&2
    exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv." >&2
    exit 1
fi

# Scratch directory for Whisper's output; removed again on exit.
# Named WORK_DIR rather than TMPDIR so the standard TMPDIR variable is left alone.
WORK_DIR="$(mktemp -d)"
cleanup() { rm -rf "$WORK_DIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$WORK_DIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

# Whisper names the transcript after the input stem; fall back to any .txt it produced.
GENERATED_TXT="$WORK_DIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$WORK_DIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
        echo "Error: No .txt output produced." >&2
        exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (output did not exist at the earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
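Because the runner refuses to overwrite an existing transcript, it is safe to point it at a whole folder and only process what is still missing. A minimal sketch of that idea follows. `pending_transcripts` is a hypothetical helper, the extension list (m4a, mp3, wav) is just an assumption about what the folder contains, and the folder in the usage comment is made up.

```bash
#!/usr/bin/env bash
# Sketch only: list audio files in a directory that have no .txt transcript next to them.
pending_transcripts() {
    local dir="$1" f stem
    for f in "$dir"/*.m4a "$dir"/*.mp3 "$dir"/*.wav; do
        [ -f "$f" ] || continue        # an unmatched glob stays literal; skip it
        stem="${f%.*}"                 # same stem rule as the runner: drop the last extension
        [ -e "$stem.txt" ] || printf '%s\n' "$f"
    done
}

# Usage, assuming the runner is installed as "whisper" (hypothetical folder):
#   pending_transcripts "/mnt/f/recordings" | while IFS= read -r f; do
#       whisper "$f" </dev/null
#   done
```

Redirecting stdin from /dev/null inside the loop keeps the transcription command from consuming the list of file names that the loop is reading.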

History (4 versions shown )

From 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d

To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:16:26) hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0

Title

The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Description

Content

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app. Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized. At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going. I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself. The end result was the following (thanks to ChatGPT): A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well. A Whisper runner for WSL/Linux: run whisper and get a .txt transcript generated from the audio file. A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click. 
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names. The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in. WSL uses python and pip… Table of Contents Toggle whisper.batwhisper.reg (explorer right clicks)installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself whisper.bat @echo off setlocal EnableExtensions REM Force UTF-8 codepage (fixes å ä ö) chcp 65001 >nul REM File passed from Explorer set "WIN_FILE=%~1" REM Convert Windows path to WSL path (UTF-8 safe now) for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i" REM Run whisper on that file wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\"" endlocal whisper.reg (explorer right clicks) Windows Registry Editor Version 5.00 [HKEY_CLASSES_ROOT\*\shell\WhisperWSL] @="Transkribera med Whisper (WSL)" "Icon"="wsl.exe" [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command] @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\"" installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller) To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts. #!/usr/bin/env bash set -euo pipefail VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}" MODE="install" # --- Parse args --- while getopts ":u" opt; do case "$opt" in u) MODE="uninstall" ;; *) echo "Usage: $0 [-u]" exit 1 ;; esac done echo "==> Whisper installer (GTX 1060 compatible)" echo "==> Mode: $MODE" # --- Sanity --- if [[ ! 
-d "$VENV_DIR" ]]; then echo "Error: venv not found: $VENV_DIR" exit 1 fi # shellcheck disable=SC1090 source "$VENV_DIR/bin/activate" python -m pip install --upgrade pip setuptools wheel # ================================================== # UNINSTALL MODE (-u) # ================================================== if [[ "$MODE" == "uninstall" ]]; then echo "==> Uninstalling incompatible packages ONLY (-u)" pip uninstall -y torch torchvision torchaudio || true pip uninstall -y numpy || true echo "" echo "Done." echo "Uninstall completed. Nothing else touched." exit 0 fi # ================================================== # INSTALL MODE (DEFAULT) # ================================================== echo "==> Installing compatible stack (no forced uninstall)" pip install \ numpy==1.26.4 \ torch==1.13.1+cu116 \ torchvision==0.14.1+cu116 \ torchaudio==0.13.1 \ --extra-index-url https://download.pytorch.org/whl/cu116 # --- Verify --- echo "==> Verifying environment" python - /dev/null 2>&1; then echo "Error: whisper not found in PATH or venv." exit 1 fi TMPDIR="$(mktemp -d)" cleanup() { rm -rf "$TMPDIR"; } trap cleanup EXIT echo "==> Transcribing:" echo " input: $INPUT" echo " output: $OUTPUT" echo " model: $MODEL" echo " lang: ${LANGUAGE:-auto}" ARGS=( "$INPUT" --model "$MODEL" --output_dir "$TMPDIR" --output_format txt --task transcribe --verbose False --fp16 False ) if [[ -n "$LANGUAGE" ]]; then ARGS+=( --language "$LANGUAGE" ) fi "$WHISPER_BIN" "${ARGS[@]}" GENERATED_TXT="$TMPDIR/$STEM.txt" if [[ ! -f "$GENERATED_TXT" ]]; then FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)" if [[ -z "${FOUND_TXT:-}" ]]; then echo "Error: No .txt output produced." exit 1 fi GENERATED_TXT="$FOUND_TXT" fi # --- Final

From 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1

To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d

Title

The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Description

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text ~~transcribed~~ to be ~~pasted~~ ~~into~~ ~~Suno,~~ ~~that~~ ~~only~~ ~~exists~~ as a ~~m4a-file~~ ~~(i.e.~~ ~~music,~~ ~~that~~ ~~sometimes~~ ~~has~~ ~~hardcoded~~ ~~subtitles~~ ~~that...~~ […]

Content

From 2025-12-23 11:26:07 (discovered: 2026-02-05 14:24:03) hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d

To 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1

Title

The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Description

Content

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app. Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized. At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, ~~youâ€™re~~ you’re expected to pay quite a bit just to keep going. I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well ~~(Iâ€™m~~ (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself. The end result was the following (thanks to ChatGPT): A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well. A Whisper runner for WSL/Linux: run whisper and get a .txt transcript generated from the audio file. A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click. 
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names. The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in. WSL uses python and pip… Table of Contents Toggle whisper.batwhisper.reg (explorer right clicks)installer ~~fÃ¶r~~ för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself whisper.bat @echo off setlocal EnableExtensions REM Force UTF-8 codepage (fixes Ã¥ Ã¤ ~~Ã¶)~~ å ä ö) chcp 65001 >nul REM File passed from Explorer set "WIN_FILE=%~1" REM Convert Windows path to WSL path (UTF-8 safe now) for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i" REM Run whisper on that file wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\"" endlocal whisper.reg (explorer right clicks) Windows Registry Editor Version 5.00 [HKEY_CLASSES_ROOT\*\shell\WhisperWSL] @="Transkribera med Whisper (WSL)" "Icon"="wsl.exe" [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command] @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\"" installer ~~fÃ¶r~~ för WSL/Linux (with 1060-compatibilty and pre-uninstaller) To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts. #!/usr/bin/env bash set -euo pipefail VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}" MODE="install" # --- Parse args --- while getopts ":u" opt; do case "$opt" in u) MODE="uninstall" ;; *) echo "Usage: $0 [-u]" exit 1 ;; esac done echo "==> Whisper installer (GTX 1060 compatible)" echo "==> Mode: $MODE" # --- Sanity --- if [[ ! 
-d "$VENV_DIR" ]]; then echo "Error: venv not found: $VENV_DIR" exit 1 fi # shellcheck disable=SC1090 source "$VENV_DIR/bin/activate" python -m pip install --upgrade pip setuptools wheel # ================================================== # UNINSTALL MODE (-u) # ================================================== if [[ "$MODE" == "uninstall" ]]; then echo "==> Uninstalling incompatible packages ONLY (-u)" pip uninstall -y torch torchvision torchaudio || true pip uninstall -y numpy || true echo "" echo "Done." echo "Uninstall completed. Nothing else touched." exit 0 fi # ================================================== # INSTALL MODE (DEFAULT) # ================================================== echo "==> Installing compatible stack (no forced uninstall)" pip install \ numpy==1.26.4 \ torch==1.13.1+cu116 \ torchvision==0.14.1+cu116 \ torchaudio==0.13.1 \ --extra-index-url https://download.pytorch.org/whl/cu116 # --- Verify --- echo "==> Verifying environment" python - /dev/null 2>&1; then echo "Error: whisper not found in PATH or venv." exit 1 fi TMPDIR="$(mktemp -d)" cleanup() { rm -rf "$TMPDIR"; } trap cleanup EXIT echo "==> Transcribing:" echo " input: $INPUT" echo " output: $OUTPUT" echo " model: $MODEL" echo " lang: ${LANGUAGE:-auto}" ARGS=( "$INPUT" --model "$MODEL" --output_dir "$TMPDIR" --output_format txt --task transcribe --verbose False --fp16 False ) if [[ -n "$LANGUAGE" ]]; then ARGS+=( --language "$LANGUAGE" ) fi "$WHISPER_BIN" "${ARGS[@]}" GENERATED_TXT="$TMPDIR/$STEM.txt" if [[ ! -f "$GENERATED_TXT" ]]; then FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)" if [[ -z "${FOUND_TXT:-}" ]]; then echo "Error: No .txt output produced." exit 1 fi GENERATED_TXT="$FOUND_TXT" fi # --- Final

Versions

2025-12-23 11:26:07

Discovered: 2026-04-24 08:16:26 Hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0

Title:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Description:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]

Content

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
Toggle
whisper.batwhisper.reg (explorer right clicks)installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)

To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
case "$opt" in
u) MODE="uninstall" ;;
*)
echo "Usage: $0 [-u]"
exit 1
;;
esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
echo "Error: venv not found: $VENV_DIR"
exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
echo "==> Uninstalling incompatible packages ONLY (-u)"

pip uninstall -y torch torchvision torchaudio || true
pip uninstall -y numpy || true

echo ""
echo "Done."
echo "Uninstall completed. Nothing else touched."
exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
numpy==1.26.4 \
torch==1.13.1+cu116 \
torchvision==0.14.1+cu116 \
torchaudio==0.13.1 \
--extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
print("GPU:", torch.cuda.get_device_name(0))
print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script can run without any switches – and only with the audio file intended to be transcribed (but as you can see, it can do a bit more).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
echo "Usage: whisper <input.extension> [model] [language]"
exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
echo "Error: Input file not found: $INPUT"
exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
echo "Error: Output file already exists:"
echo " $OUTPUT"
echo "Aborting to avoid overwrite."
exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
echo "Error: whisper not found in PATH or venv."
exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
"$INPUT"
--model "$MODEL"
--output_dir "$TMPDIR"
--output_format txt
--task transcribe
--verbose False
--fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
if [[ -z "${FOUND_TXT:-}" ]]; then
echo "Error: No .txt output produced."
exit 1
fi
GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
2025-12-23 11:26:07

Discovered: 2026-04-24 08:14:23 Hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d

Title:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Description:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

Content

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
Toggle
whisper.batwhisper.reg (explorer right clicks)installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)

To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
  case "$opt" in
    u) MODE="uninstall" ;;
    *)
      echo "Usage: $0 [-u]"
      exit 1
      ;;
  esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
  echo "Error: venv not found: $VENV_DIR"
  exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
  echo "==> Uninstalling incompatible packages ONLY (-u)"

  pip uninstall -y torch torchvision torchaudio || true
  pip uninstall -y numpy || true

  echo ""
  echo "Done."
  echo "Uninstall completed. Nothing else touched."
  exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
  numpy==1.26.4 \
  torch==1.13.1+cu116 \
  torchvision==0.14.1+cu116 \
  torchaudio==0.13.1 \
  --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
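
The recovery cycle described for the -u switch (uninstall first, then install again) can be wrapped in a small helper. This is a minimal sketch, not part of the original scripts: the function name reinstall_stack is hypothetical, and the path you pass is wherever you happened to save the installer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the installer's uninstall pass, then a fresh install.
# Usage: reinstall_stack /path/to/installer-script
reinstall_stack() {
  local installer="$1"
  if [ ! -f "$installer" ]; then
    echo "installer not found: $installer" >&2
    return 1
  fi
  # First pass: -u removes torch, torchvision, torchaudio and numpy from the venv.
  bash "$installer" -u || return 1
  # Second pass: the default mode installs the pinned versions again.
  bash "$installer"
}
```

Because the second pass only runs if the uninstall pass succeeded, a failing uninstall is not silently followed by an install on top of a half-removed stack.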

The script itself

The runner needs no switches, only the audio file to transcribe. As the usage comment shows, it also accepts an optional model name (default: small) and an optional language code as second and third arguments.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
  echo "Usage: whisper <input.extension> [model] [language]"
  exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
  echo "Error: Input file not found: $INPUT"
  exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
  echo "Error: Output file already exists:"
  echo " $OUTPUT"
  echo "Aborting to avoid overwrite."
  exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
  WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
  echo "Error: whisper not found in PATH or venv."
  exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
  "$INPUT"
  --model "$MODEL"
  --output_dir "$TMPDIR"
  --output_format txt
  --task transcribe
  --verbose False
  --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
  ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
  FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
  if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
  fi
  GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
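
To make the naming rule concrete: the runner writes the transcript next to the input file, replacing only the last extension with .txt. The sketch below isolates that rule in a standalone function so it can be tried without running Whisper. The function name transcript_path is made up for this example; the logic mirrors the BASENAME, STEM and OUTDIR lines of the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints where the runner would write the transcript
# for a given input file (same directory, last extension replaced by .txt).
transcript_path() {
  local input="$1"
  local base stem dir
  base="$(basename "$input")"   # file name without directories
  stem="${base%.*}"             # strip only the last extension
  dir="$(dirname "$input")"     # "." when no directory was given
  printf '%s/%s.txt\n' "$dir" "$stem"
}
```

So transcript_path '/mnt/c/Music/My Song.m4a' prints /mnt/c/Music/My Song.txt, and the runner refuses to start if that file already exists. A typical call of the runner with all three arguments would look like whisper interview.m4a medium sv, where medium and sv are example values for the model and language.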
