Tornevalls Blog

URLID: 10
Source URL: https://www.tornevalls.se
Categories: Tornevall

RSS endpoint: https://tools.tornevall.net/api/rss/feed/10

People are predicting Suno’s death – how likely is it?

Permalink
Published: 2025-12-29 10:32:07
Discovered: 2026-04-24 08:14:23
Author: 1
Hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
https://www.tornevalls.se/people-are-predicting-sunos-death-how-likely-is-it/

Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.
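
A minimal sketch of what that backup habit could look like in Python. Everything here is an assumption for illustration: the function name `back_up`, the file extensions, and the `manifest.json` file are hypothetical, and nothing in it talks to Suno. It only copies files you have already downloaded and records a SHA-256 checksum for each, so you can later check that a copy is intact. It assumes the backup folder sits outside the source folder.

```python
import hashlib
import json
import shutil
from pathlib import Path

# Assumption: adjust these to whatever formats you actually export and write notes in.
AUDIO_SUFFIXES = {".wav", ".flac", ".mp3"}
NOTE_SUFFIXES = {".txt", ".md"}
BACKUP_SUFFIXES = AUDIO_SUFFIXES | NOTE_SUFFIXES


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, target: Path) -> dict[str, str]:
    """Copy audio and note files from source to target.

    Returns a mapping of relative path -> SHA-256 of the copied file and
    writes the same mapping to target/manifest.json.
    """
    target.mkdir(parents=True, exist_ok=True)
    manifest: dict[str, str] = {}
    for path in sorted(source.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in BACKUP_SUFFIXES:
            continue
        relative = path.relative_to(source)
        destination = target / relative
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, destination)  # copy2 also keeps file timestamps
        manifest[relative.as_posix()] = sha256_of(destination)
    manifest_file = target / "manifest.json"
    manifest_file.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```

Calling something like `back_up(Path("exports"), Path("/mnt/backup/suno"))` after each session would be enough; both paths are placeholders. The point is the habit, not the tool.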

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. These would not necessarily resemble YouTube-style Content ID systems; they could instead take the form of softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but no SaaS-based business is immune to one, so it should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.
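
One way to keep that documentation is a plain append-only log next to your project files. The sketch below is only an illustration under my own assumptions: the `Contribution` fields, the function names, and the JSON Lines file are all hypothetical, and a log like this is a personal record, not legal proof of anything.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Contribution:
    track: str        # your own name for the song or project
    kind: str         # e.g. "lyrics", "edit", "arrangement", "post-production"
    description: str  # what you actually did, in your own words
    timestamp: str = field(default_factory=_now)


def log_contribution(log_file: Path, entry: Contribution) -> None:
    """Append one entry to the log as a single JSON line."""
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(asdict(entry)) + "\n")


def contributions_for(log_file: Path, track: str) -> list[dict]:
    """Return all logged entries for one track, oldest first."""
    with log_file.open(encoding="utf-8") as handle:
        entries = [json.loads(line) for line in handle if line.strip()]
    return [entry for entry in entries if entry["track"] == track]
```

A call such as `log_contribution(Path("contributions.jsonl"), Contribution("demo-track", "lyrics", "Rewrote the second verse"))` adds one line per piece of work. Because the file is only ever appended to, it doubles as a rough version history of what you did and when.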

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.


History — 4 versions shown

Changes

From 2025-12-29 10:32:07 (discovered: 2026-04-24 08:14:23) hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
To 2025-12-29 10:32:07 (discovered: 2026-04-24 08:16:25) hash: 2f7e4a0d8c75dd7792918888f337748908ecb6f5
From 2025-12-29 10:32:07 (discovered: 2026-03-19 13:50:20) hash: 696e004598696c9b32ec7879894bc14619881ea9
To 2025-12-29 10:32:07 (discovered: 2026-04-24 08:14:23) hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
Old vs new
From
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

To
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […]
From 2025-12-29 10:32:07 (discovered: 2026-02-05 14:24:03) hash: e7e71a61897ed20875ea43e501419311b91b35b1
To 2025-12-29 10:32:07 (discovered: 2026-03-19 13:50:20) hash: 696e004598696c9b32ec7879894bc14619881ea9
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
  Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
  Claim: “A settlement will force a ‘clean model’ and kill creativity”
  Claim: “You don’t own anything, you are renting, and your catalog can vanish”
  Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
  High probability changes
  Medium probability changes
  Lower probability, but still worth planning for
What about us who actually do the work?
  If you actually create something
  If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.

Versions

  1. 2025-12-29 10:32:07
    Discovered: 2026-04-24 08:16:25 Hash: 2f7e4a0d8c75dd7792918888f337748908ecb6f5
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […]
  2. 2025-12-29 10:32:07
    Discovered: 2026-04-24 08:14:23 Hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […]
    Content
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

    That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

    I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

    Table of Contents
    What is actually happening
    Why the “Suno will die” narrative keeps showing up
    Latest claims I have seen in that thread
      Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
      Claim: “A settlement will force a ‘clean model’ and kill creativity”
      Claim: “You don’t own anything, you are renting, and your catalog can vanish”
      Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
    So what are the real risks for Suno?
      High probability changes
      Medium probability changes
      Lower probability, but still worth planning for
    What about us who actually do the work?
      If you actually create something
      If you do nothing and just press generate
    What you should do right now

    What is actually happening

    The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

    At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

    Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

    Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

    Why the “Suno will die” narrative keeps showing up

    This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

    First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

    Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

    Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

    The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

    Latest claims I have seen in that thread

    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

    What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

    What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

    So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

    Claim: “A settlement will force a ‘clean model’ and kill creativity”

    What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

    What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

    Claim: “You don’t own anything, you are renting, and your catalog can vanish”

    This is the part where people accidentally become correct, but for the wrong reasons.

    Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

    Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

    The practical takeaway is simple and non-dramatic:

    Back up your WAVs/stems and project notes locally. Always.

    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

    Possible: yes, as a policy choice.

    Inevitable: no.

    Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

    So what are the real risks for Suno?

    Think in terms of business incentives.

    High probability changes

    When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

    Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

    Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

    Medium probability changes

    It is also plausible that Suno will introduce stronger similarity or compliance checks. These would not necessarily resemble YouTube-style Content ID systems; more likely they would take the form of softer pressure aimed at avoiding obvious sound-alikes.

    In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

    Lower probability, but still worth planning for

    There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

    A complete shutdown of the service is unlikely, but no SaaS-based business is immune to one, so it should not be treated as unthinkable.

    What about us who actually do the work?

    Here is the split that regulation will make clearer over time.

    If you actually create something

    If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

    At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

    If you do nothing and just press generate

    This is where it all goes to shit, and yes, this is exactly where regulation is needed.

    When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

    So my position is simple and non-negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

    What you should do right now

    This does not require panic. It does require using your head.

    Read the agreement you are actually bound by. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

    That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

    Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

    So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
  3. 2025-12-29 10:32:07
    Discovered: 2026-03-19 13:50:20 Hash: 696e004598696c9b32ec7879894bc14619881ea9
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...
  4. 2025-12-29 10:32:07
    Discovered: 2026-02-05 14:24:03 Hash: e7e71a61897ed20875ea43e501419311b91b35b1
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

People are predicting Suno’s death – how likely is it?

Permalink
Published: 2025-12-29 10:32:07
Discovered: 2026-03-19 13:50:20
Author: 1
Hash: 696e004598696c9b32ec7879894bc14619881ea9
https://www.tornevalls.se/people-are-predicting-sunos-death-how-likely-is-it/
Description

Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

Content

Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.
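
If you want that backup habit to be repeatable, a small script can do it. What follows is only a minimal sketch, not anything Suno provides: it assumes you have already downloaded your WAVs, stems, and notes into a local folder, and the function name `backup_exports`, the extension list, and the `manifest.json` layout are all made up for illustration. It copies matching files into a new timestamped folder and records a SHA-256 checksum for each, so you can later show what you had and when. Keep the backup folder outside the source folder.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical choice of extensions; adjust to whatever you actually export.
BACKUP_EXTENSIONS = {".wav", ".flac", ".mp3", ".txt", ".md"}


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_exports(source: Path, backup_root: Path) -> Path:
    """Copy matching files from source into a new timestamped folder.

    Returns the path of the manifest written alongside the copies.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_root / stamp
    target.mkdir(parents=True, exist_ok=False)

    manifest = []
    for item in sorted(source.rglob("*")):
        if not item.is_file() or item.suffix.lower() not in BACKUP_EXTENSIONS:
            continue
        relative = item.relative_to(source)
        destination = target / relative
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(item, destination)  # copy2 keeps modification times
        manifest.append({
            "file": relative.as_posix(),
            "bytes": destination.stat().st_size,
            "sha256": sha256_of(destination),
        })

    manifest_path = target / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest_path
```

Calling `backup_exports(Path("exports"), Path("backups"))` after each session would leave one dated folder per run. The manifest is a convenience for your own records, not legal proof of authorship.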

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. These would not necessarily resemble YouTube-style Content ID systems; more likely they would take the form of softer pressure aimed at avoiding obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but no SaaS-based business is immune to one, so it should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and non-negotiable. Regulation is welcome, not to kill AI music, but to draw clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you have actually accepted. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
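
For anyone who wants to make the "keep local copies" habit a bit more systematic, here is a minimal sketch in Python. It assumes you have already downloaded your WAVs, stems, and notes into a local folder. The file name manifest.json and the function names are my own illustrations, not anything Suno provides. The script only records what is on your disk, with a checksum and a timestamp, so you can later show which files you had and when.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical file name for the inventory; pick whatever suits your archive.
MANIFEST_NAME = "manifest.json"


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir: Path, notes: str = "") -> dict:
    """List every file under archive_dir with its size and checksum."""
    files = []
    for path in sorted(archive_dir.rglob("*")):
        # Skip directories and the manifest itself.
        if path.is_file() and path.name != MANIFEST_NAME:
            files.append({
                "path": path.relative_to(archive_dir).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "notes": notes,  # free text: what you wrote, edited, or re-recorded
        "files": files,
    }


def write_manifest(archive_dir: Path, notes: str = "") -> Path:
    """Write the manifest as JSON next to the archived files."""
    manifest_path = archive_dir / MANIFEST_NAME
    manifest = build_manifest(archive_dir, notes)
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest_path
```

Calling write_manifest(Path("suno-archive"), notes="lyrics rewritten, vocals re-recorded") after each session would leave a dated inventory next to the files; the folder name is only an example. It is not legal proof of anything, but it is a cheap way to keep the version history and documentation described above.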


History — 4 versions shown

Changes

From 2025-12-29 10:32:07 (discovered: 2026-04-24 08:14:23) hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
To 2025-12-29 10:32:07 (discovered: 2026-04-24 08:16:25) hash: 2f7e4a0d8c75dd7792918888f337748908ecb6f5
Title
People are predicting Suno’s death – how likely is it?
Description
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […]
From 2025-12-29 10:32:07 (discovered: 2026-03-19 13:50:20) hash: 696e004598696c9b32ec7879894bc14619881ea9
To 2025-12-29 10:32:07 (discovered: 2026-04-24 08:14:23) hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
Title
People are predicting Suno’s death – how likely is it?
Description
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main... […]
Content
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. 
Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. 
Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten.
Old vs new
From
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents

What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
To
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […]

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents

What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
From 2025-12-29 10:32:07 (discovered: 2026-02-05 14:24:03) hash: e7e71a61897ed20875ea43e501419311b91b35b1
To 2025-12-29 10:32:07 (discovered: 2026-03-19 13:50:20) hash: 696e004598696c9b32ec7879894bc14619881ea9
Title
People are predicting Suno’s death – how likely is it?
Description
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...
Content
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents

What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten.
Old vs new
From
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents

What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.

Versions

  1. 2025-12-29 10:32:07
    Discovered: 2026-04-24 08:16:25 Hash: 2f7e4a0d8c75dd7792918888f337748908ecb6f5
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […]
    Content
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

    That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

    I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

    Table of Contents
    What is actually happening
    Why the “Suno will die” narrative keeps showing up
    Latest claims I have seen in that thread
    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
    Claim: “A settlement will force a ‘clean model’ and kill creativity”
    Claim: “You don’t own anything, you are renting, and your catalog can vanish”
    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
    So what are the real risks for Suno?
    High probability changes
    Medium probability changes
    Lower probability, but still worth planning for
    What about us who actually do the work?
    If you actually create something
    If you do nothing and just press generate
    What you should do right now

    What is actually happening

    The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

    At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

    Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

    Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

    Why the “Suno will die” narrative keeps showing up

    This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

    First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

    Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

    Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

    The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

    Latest claims I have seen in that thread

    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

    What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

    What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

    So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

    Claim: “A settlement will force a ‘clean model’ and kill creativity”

    What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

    What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

    Claim: “You don’t own anything, you are renting, and your catalog can vanish”

    This is the part where people accidentally become correct, but for the wrong reasons.

    Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

    Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

    The practical takeaway is simple and non-dramatic:

    Back up your WAVs/stems and project notes locally. Always.

    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

    Possible: yes, as a policy choice.

    Inevitable: no.

    Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

    So what are the real risks for Suno?

    Think in terms of business incentives.

    High probability changes

    When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

    Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

    Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

    Medium probability changes

    It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

    In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

    Lower probability, but still worth planning for

    There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

    A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

    What about us who actually do the work?

    Here is the split that regulation will make clearer over time.

    If you actually create something

    If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

    At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

    If you do nothing and just press generate

    This is where it all goes to shit, and yes, this is exactly where regulation is needed.

    When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

    So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

    What you should do right now

    This does not require panic. It does require using your head.

    Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

    That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

    Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

    So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
  2. 2025-12-29 10:32:07
    Discovered: 2026-04-24 08:14:23 Hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […]
    Content
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

    That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

    I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

    Table of Contents
    What is actually happening
    Why the “Suno will die” narrative keeps showing up
    Latest claims I have seen in that thread
    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
    Claim: “A settlement will force a ‘clean model’ and kill creativity”
    Claim: “You don’t own anything, you are renting, and your catalog can vanish”
    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
    So what are the real risks for Suno?
    High probability changes
    Medium probability changes
    Lower probability, but still worth planning for
    What about us who actually do the work?
    If you actually create something
    If you do nothing and just press generate
    What you should do right now

    What is actually happening

    The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

    At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

    Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

    Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

    Why the “Suno will die” narrative keeps showing up

    This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

    First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

    Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

    Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

    The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

    Latest claims I have seen in that thread

    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

    What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

    What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

    So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

    Claim: “A settlement will force a ‘clean model’ and kill creativity”

    What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

    What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

    Claim: “You don’t own anything, you are renting, and your catalog can vanish”

    This is the part where people accidentally become correct, but for the wrong reasons.

    Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

    Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

    The practical takeaway is simple and non-dramatic:

    Back up your WAVs/stems and project notes locally. Always.

    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

    Possible: yes, as a policy choice.

    Inevitable: no.

    Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

    So what are the real risks for Suno?

    Think in terms of business incentives.

    High probability changes

    When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

    Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

    Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

    Medium probability changes

    It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

    In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

    Lower probability, but still worth planning for

    There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

    A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

    What about us who actually do the work?

    Here is the split that regulation will make clearer over time.

    If you actually create something

    If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

    At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

    If you do nothing and just press generate

    This is where it all goes to shit, and yes, this is exactly where regulation is needed.

    When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

    So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

    What you should do right now

    This does not require panic. It does require using your head.

    Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

    That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

    Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

    So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
  3. 2025-12-29 10:32:07
    Discovered: 2026-03-19 13:50:20 Hash: 696e004598696c9b32ec7879894bc14619881ea9
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...
    Content
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

    That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

    I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

    Table of Contents
    What is actually happening
    Why the “Suno will die” narrative keeps showing up
    Latest claims I have seen in that thread
    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
    Claim: “A settlement will force a ‘clean model’ and kill creativity”
    Claim: “You don’t own anything, you are renting, and your catalog can vanish”
    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
    So what are the real risks for Suno?
    High probability changes
    Medium probability changes
    Lower probability, but still worth planning for
    What about us who actually do the work?
    If you actually create something
    If you do nothing and just press generate
    What you should do right now

    What is actually happening

    The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

    At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

    Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

    Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

    Why the “Suno will die” narrative keeps showing up

    This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

    First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

    Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

    Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

    The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

    Latest claims I have seen in that thread

    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

    What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

    What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

    So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

    Claim: “A settlement will force a ‘clean model’ and kill creativity”

    What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

    What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

    Claim: “You don’t own anything, you are renting, and your catalog can vanish”

    This is the part where people accidentally become correct, but for the wrong reasons.

    Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

    Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

    The practical takeaway is simple and non-dramatic:

    Back up your WAVs/stems and project notes locally. Always.

    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

    Possible: yes, as a policy choice.

    Inevitable: no.

    Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

    So what are the real risks for Suno?

    Think in terms of business incentives.

    High probability changes

    When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

    Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

    Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

    Medium probability changes

    It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

    In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

    Lower probability, but still worth planning for

    There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

    A complete shutdown of the service is unlikely, but no SaaS business is immune to one, so it should not be treated as unthinkable.

    What about us who actually do the work?

    Here is the split that regulation will make clearer over time.

    If you actually create something

    If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

    At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

    If you do nothing and just press generate

    This is where it all goes to shit, and yes, this is exactly where regulation is needed.

    When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

    So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

    What you should do right now

    This does not require panic. It does require using your head.

    Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

    That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

    Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

    So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
  4. 2025-12-29 10:32:07
    Discovered: 2026-02-05 14:24:03 Hash: e7e71a61897ed20875ea43e501419311b91b35b1
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...
    Content
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

    That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

    I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

    Table of Contents
      What is actually happening
      Why the “Suno will die” narrative keeps showing up
      Latest claims I have seen in that thread
      Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
      Claim: “A settlement will force a ‘clean model’ and kill creativity”
      Claim: “You don’t own anything, you are renting, and your catalog can vanish”
      Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
      So what are the real risks for Suno?
      High probability changes
      Medium probability changes
      Lower probability, but still worth planning for
      What about us who actually do the work?
      If you actually create something
      If you do nothing and just press generate
      What you should do right now

    What is actually happening

    The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

    At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

    Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

    Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

    Why the “Suno will die” narrative keeps showing up

    This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

    First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

    Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

    Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

    The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

    Latest claims I have seen in that thread

    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

    What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

    What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

    So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

    Claim: “A settlement will force a ‘clean model’ and kill creativity”

    What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

    What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

    Claim: “You don’t own anything, you are renting, and your catalog can vanish”

    This is the part where people accidentally become correct, but for the wrong reasons.

    Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

    Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

    The practical takeaway is simple and non-dramatic:

    Back up your WAVs/stems and project notes locally. Always.

    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

    Possible: yes, as a policy choice.

    Inevitable: no.

    Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

    So what are the real risks for Suno?

    Think in terms of business incentives.

    High probability changes

    When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

    Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

    Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

    Medium probability changes

    It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

    In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

    Lower probability, but still worth planning for

    There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

    A complete shutdown of the service is unlikely, but no SaaS business is immune to one, so it should not be treated as unthinkable.

    What about us who actually do the work?

    Here is the split that regulation will make clearer over time.

    If you actually create something

    If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

    At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

    If you do nothing and just press generate

    This is where it all goes to shit, and yes, this is exactly where regulation is needed.

    When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

    So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

    What you should do right now

    This does not require panic. It does require using your head.

    Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

    That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

    Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

    So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.

People are predicting Suno’s death – how likely is it?

Permalink
Published: 2025-12-29 10:32:07
Discovered: 2026-02-05 14:24:03
Author: 1
Hash: e7e71a61897ed20875ea43e501419311b91b35b1
https://www.tornevalls.se/people-are-predicting-sunos-death-how-likely-is-it/
Description

Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

Content

Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
  What is actually happening
  Why the “Suno will die” narrative keeps showing up
  Latest claims I have seen in that thread
  Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
  Claim: “A settlement will force a ‘clean model’ and kill creativity”
  Claim: “You don’t own anything, you are renting, and your catalog can vanish”
  Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
  So what are the real risks for Suno?
  High probability changes
  Medium probability changes
  Lower probability, but still worth planning for
  What about us who actually do the work?
  If you actually create something
  If you do nothing and just press generate
  What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but no SaaS business is immune to one, so it should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.


History — 4 versions shown

Changes

From 2025-12-29 10:32:07 (discovered: 2026-04-24 08:14:23) hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
To 2025-12-29 10:32:07 (discovered: 2026-04-24 08:16:25) hash: 2f7e4a0d8c75dd7792918888f337748908ecb6f5
Title
People are predicting Suno’s death – how likely is it?
Description
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […]
Content
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. 
Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. 
Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten.
Old vs new
From
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […]

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
To
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […]

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
From 2025-12-29 10:32:07 (discovered: 2026-03-19 13:50:20) hash: 696e004598696c9b32ec7879894bc14619881ea9
To 2025-12-29 10:32:07 (discovered: 2026-04-24 08:14:23) hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
Title
People are predicting Suno’s death – how likely is it?
Description
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main... […]
Content
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten.
Old vs new
From
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
To
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […]

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
From 2025-12-29 10:32:07 (discovered: 2026-02-05 14:24:03) hash: e7e71a61897ed20875ea43e501419311b91b35b1
To 2025-12-29 10:32:07 (discovered: 2026-03-19 13:50:20) hash: 696e004598696c9b32ec7879894bc14619881ea9
Title
People are predicting Suno’s death – how likely is it?
Description
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...
Content
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. 
Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. 
Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten.
Old vs new
From
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
To
TITLE:
People are predicting Suno’s death – how likely is it?

DESCRIPTION:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...

CONTENT:
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

Table of Contents
What is actually happening
Why the “Suno will die” narrative keeps showing up
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
Claim: “A settlement will force a ‘clean model’ and kill creativity”
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
So what are the real risks for Suno?
High probability changes
Medium probability changes
Lower probability, but still worth planning for
What about us who actually do the work?
If you actually create something
If you do nothing and just press generate
What you should do right now

What is actually happening

The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

Why the “Suno will die” narrative keeps showing up

This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

Latest claims I have seen in that thread

Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

Claim: “A settlement will force a ‘clean model’ and kill creativity”

What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

Claim: “You don’t own anything, you are renting, and your catalog can vanish”

This is the part where people accidentally become correct, but for the wrong reasons.

Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

The practical takeaway is simple and non-dramatic:

Back up your WAVs/stems and project notes locally. Always.

Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

Possible: yes, as a policy choice.

Inevitable: no.

Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

So what are the real risks for Suno?

Think in terms of business incentives.

High probability changes

When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

Medium probability changes

It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
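
For anyone who wants to make "keep local copies" a habit, here is a minimal sketch in Python. It assumes you have already downloaded your files into a local folder; it does not talk to Suno in any way. The function names, the extension list, and the manifest.json file name are illustrative choices, not anything Suno provides, and the backup folder is assumed to live outside the source folder.

```python
import hashlib
import json
import shutil
from pathlib import Path

# Hypothetical extension list; adjust it to what your projects actually contain.
AUDIO_AND_NOTES = {".wav", ".mp3", ".flac", ".txt", ".md"}


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_folder(source: Path, target: Path) -> dict:
    """Copy matching files from source to target and write manifest.json.

    Returns the manifest: relative path -> SHA-256 of the copied file.
    Assumes target is not located inside source.
    """
    manifest = {}
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.suffix.lower() in AUDIO_AND_NOTES:
            relative = path.relative_to(source)
            destination = target / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, destination)  # copy2 also keeps timestamps
            manifest[relative.as_posix()] = sha256_of(destination)
    target.mkdir(parents=True, exist_ok=True)
    (target / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Calling backup_folder(Path("suno_downloads"), Path("backup")) with your own folder names copies the files and leaves a manifest of checksums next to them, which also serves as a simple record of what you had and when.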

Versions

  1. 2025-12-29 10:32:07
    Discovered: 2026-04-24 08:16:25 Hash: 2f7e4a0d8c75dd7792918888f337748908ecb6f5
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […]
  2. 2025-12-29 10:32:07
    Discovered: 2026-04-24 08:14:23 Hash: ebc4881ceeb7fba08d75edb6c73fd894dce57b22
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […]
  3. 2025-12-29 10:32:07
    Discovered: 2026-03-19 13:50:20 Hash: 696e004598696c9b32ec7879894bc14619881ea9
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...
  4. 2025-12-29 10:32:07
    Discovered: 2026-02-05 14:24:03 Hash: e7e71a61897ed20875ea43e501419311b91b35b1
    Title:
    People are predicting Suno’s death – how likely is it?
    Description:
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...
    Content
    Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.

    That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.

    I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.

    Table of Contents
    Toggle
    What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now
    What is actually happening

    The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.

    At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.

    Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.

    Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.

    Why the “Suno will die” narrative keeps showing up

    This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.

    First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.

    Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.

    Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.

    The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.

    Latest claims I have seen in that thread

    Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”

    What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.

    What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.

    So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.

    Claim: “A settlement will force a ‘clean model’ and kill creativity”

    What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.

    What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.

    Claim: “You don’t own anything, you are renting, and your catalog can vanish”

    This is the part where people accidentally become correct, but for the wrong reasons.

    Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.

    Contract reality matters: the ToS is designed to give the platform broad rights and broad control.

    The practical takeaway is simple and non-dramatic:

    Back up your WAVs/stems and project notes locally. Always.

    Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”

    Possible: yes, as a policy choice.

    Inevitable: no.

    Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.

    So what are the real risks for Suno?

    Think in terms of business incentives.

    High probability changes

    When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.

    Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.

    Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.

    Medium probability changes

    It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes.

    In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

    Lower probability, but still worth planning for

    There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

    A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

    What about us who actually do the work?

    Here is the split that regulation will make clearer over time.

    If you actually create something

    If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

    At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

    If you do nothing and just press generate

    This is where it all goes to shit, and yes, this is exactly where regulation is needed.

    When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

    So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.

    What you should do right now

    This does not require panic. It does require using your head.

    Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

    That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

    Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

    So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.

The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Permalink
Published: 2025-12-23 11:26:07
Discovered: 2026-04-24 08:16:26
Author: 1
Hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
https://www.tornevalls.se/the-struggle-transcribe-stuff-for-free-with-whisper-and-wsl-linux-with-a-gtx-1060/
Description

I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need a text transcribed so I can paste it into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle […]

Content

I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need a text transcribed so I can paste it into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
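
To make the path conversion step less of a black box, here is a minimal Bash sketch of roughly what wslpath does for plain drive-letter paths. The function name win_to_wsl_path is made up for illustration, it assumes the default /mnt automount root, and it does not handle UNC paths or custom mount settings. Inside the batch file, the real wslpath remains the right tool.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the original scripts):
# convert a drive-letter Windows path to its WSL equivalent.
# Assumes drives are mounted under /mnt, which is the WSL default.
win_to_wsl_path() {
  local win="$1"
  # Drive letter, colon, one slash or backslash, then the rest of the path.
  local re='^([A-Za-z]):[\\/](.*)$'

  if [[ ! "$win" =~ $re ]]; then
    echo "unsupported path: $win" >&2
    return 1
  fi

  # Lowercase the drive letter (requires bash 4 or newer).
  local drive="${BASH_REMATCH[1],,}"
  local rest="${BASH_REMATCH[2]}"

  # Turn every backslash into a forward slash.
  rest="${rest//\\//}"

  printf '/mnt/%s/%s\n' "$drive" "$rest"
}
```

For example, win_to_wsl_path 'F:\viktigt\Private\Linux-Scripts\Whisper.bat' prints /mnt/f/viktigt/Private/Linux-Scripts/Whisper.bat. Spaces in the path survive as long as the argument stays quoted, which is the same reason the batch file wraps %WIN_FILE% and %WSL_FILE% in quotes.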

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure everything is removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, this switch lets you reinstall a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
  case "$opt" in
    u) MODE="uninstall" ;;
    *)
      echo "Usage: $0 [-u]"
      exit 1
      ;;
  esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
  echo "Error: venv not found: $VENV_DIR"
  exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
  echo "==> Uninstalling incompatible packages ONLY (-u)"

  pip uninstall -y torch torchvision torchaudio || true
  pip uninstall -y numpy || true

  echo ""
  echo "Done."
  echo "Uninstall completed. Nothing else touched."
  exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
  numpy==1.26.4 \
  torch==1.13.1+cu116 \
  torchvision==0.14.1+cu116 \
  torchaudio==0.13.1 \
  --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
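
The installer exits with "venv not found" if the virtual environment is missing, so the venv has to exist before it runs. A minimal sketch of that preparation step follows, assuming python3 with the venv module is available inside WSL. The helper name ensure_venv is hypothetical and not part of the original scripts.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the original post):
# create the virtual environment the installer expects, unless it
# already exists. Defaults to the same path the installer uses.
ensure_venv() {
  local venv_dir="${1:-$HOME/.venvs/whisper}"

  # An executable bin/python is taken as a sign the venv is in place.
  if [[ -x "$venv_dir/bin/python" ]]; then
    echo "exists: $venv_dir"
    return 0
  fi

  # Assumes python3 and its venv module are installed in WSL.
  python3 -m venv "$venv_dir"
  echo "created: $venv_dir"
}
```

Calling ensure_venv with no argument targets $HOME/.venvs/whisper, matching the installer's VENV_DIR default. Note that the installer as shown pins NumPy and PyTorch but does not install Whisper itself. Presumably that happens separately, for example with pip install openai-whisper inside the venv, though that step is an assumption rather than something the post states.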

The script itself

The script needs no switches, only the audio file to transcribe. As the usage notes in the script show, it also accepts an optional model and language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
#   whisper <input.extension> [model] [language]
#
# Output:
#   <input-filename>.txt (same directory)
#
# Behaviour:
#   - Refuses to overwrite existing .txt
#   - Stops execution if output exists

if [[ $# -lt 1 ]]; then
  echo "Usage: whisper <input.extension> [model] [language]"
  exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
  echo "Error: Input file not found: $INPUT"
  exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
  echo "Error: Output file already exists:"
  echo " $OUTPUT"
  echo "Aborting to avoid overwrite."
  exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
  WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
  echo "Error: whisper not found in PATH or venv."
  exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
  "$INPUT"
  --model "$MODEL"
  --output_dir "$TMPDIR"
  --output_format txt
  --task transcribe
  --verbose False
  --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
  ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
  FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
  if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
  fi
  GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
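
The output naming in the runner comes down to basename, dirname, and one parameter expansion. The small sketch below isolates that logic so the behaviour with dots and spaces in file names is easier to see. The helper name transcript_path_for is hypothetical and only used for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper mirroring the runner's naming logic:
# given an input file path, print where the transcript would be written.
transcript_path_for() {
  local input="$1"
  local base stem dir

  base="$(basename "$input")"   # file name without its directory
  stem="${base%.*}"             # strip only the last ".extension"
  dir="$(dirname "$input")"     # "." when the input has no directory part

  printf '%s/%s.txt\n' "$dir" "$stem"
}
```

Because %.* removes only the shortest matching suffix, a file such as "my song.v2.m4a" maps to "my song.v2.txt" in the same directory. In practice the runner is called as whisper <input.extension> [model] [language], so something like whisper "my song.m4a" medium sv would pick the medium model and force Swedish, while leaving the language out lets Whisper detect it.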


History — 4 versions shown

Changes

From 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:16:26) hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)

To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script can run without any switches – and only with the audio file intended to be transcribed (but as you can see, it can do a bit more).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

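REM File passed from Explorer
set "WIN_FILE=%~1"

REM Stop early if the script was started without a file argument
if not defined WIN_FILE (
  echo Usage: whisper.bat ^<audio-file^>
  exit /b 1
)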

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
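
The batch file leaves the path conversion to wslpath. To make that step easier to follow, here is a minimal sketch of the mapping for plain drive-letter paths. The function name win_to_wsl_path is hypothetical, and the sketch assumes the default /mnt/<drive> mount point; it is an illustration only, not a replacement for wslpath, and it does not handle UNC paths or a customized automount root.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original scripts): approximates what
# wslpath does for simple drive-letter paths such as F:\dir\file.m4a.
# Assumption: drives are mounted under /mnt/<lowercase drive letter>.
win_to_wsl_path() {
  local win_path="$1"
  local drive_re='^[A-Za-z]:\\'   # drive letter, colon, backslash
  local drive rest

  if [[ ! "$win_path" =~ $drive_re ]]; then
    echo "Unsupported path: $win_path" >&2
    return 1
  fi

  drive="${win_path:0:1}"         # e.g. "F"
  rest="${win_path:2}"            # everything after "F:"
  rest="${rest//\\//}"            # turn backslashes into forward slashes
  printf '/mnt/%s%s\n' "${drive,,}" "$rest"
}
```

Calling win_to_wsl_path with a path in single quotes keeps the backslashes intact. Spaces survive because every expansion is quoted, which is the same reason the batch file wraps %WIN_FILE% and %WSL_FILE% in quotes.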

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure the old packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u to uninstall torch, torchvision, torchaudio and numpy, then run it again without switches to reinstall without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
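
The installer stops with "venv not found" unless the virtual environment already exists, and the post does not show how it was created. One possible bootstrap step, sketched under assumptions: the function name ensure_whisper_venv is hypothetical, python3 with the venv module is assumed to be installed in WSL, and the whisper command is assumed to come from the PyPI package openai-whisper (which also needs ffmpeg on the system).

```shell
#!/usr/bin/env bash
# Hypothetical bootstrap helper, not part of the original scripts.
# Creates the virtual environment the installer expects, if it is missing.
ensure_whisper_venv() {
  local venv_dir="${1:-$HOME/.venvs/whisper}"

  if [[ -d "$venv_dir" ]]; then
    echo "venv already exists: $venv_dir"
    return 0
  fi

  mkdir -p "$(dirname "$venv_dir")"
  python3 -m venv "$venv_dir"
  echo "created venv: $venv_dir"
}

# Possible usage (assumption: whisper itself is installed from PyPI):
#   ensure_whisper_venv
#   "$HOME/.venvs/whisper/bin/pip" install openai-whisper
```

The default path matches the VENV_DIR default used by the installer, so the sanity check passes once this has run.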

The script itself

The script needs only the audio file as an argument. The model (default: small) and the language (default: auto-detect) are optional second and third arguments, for example: whisper recording.m4a medium sv

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
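
The way the runner derives the transcript name from the input file can be tried in isolation. The sketch below repeats the same basename, suffix-stripping and dirname steps in a small function; the name transcript_path_for is hypothetical and only exists for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the runner's naming logic:
# same directory as the input, same file name, extension replaced by .txt.
transcript_path_for() {
  local input="$1"
  local base stem dir

  base="$(basename "$input")"   # file name without directory
  stem="${base%.*}"             # strip the last extension only
  dir="$(dirname "$input")"     # "." when no directory was given

  printf '%s/%s.txt\n' "$dir" "$stem"
}

transcript_path_for "/mnt/f/My Music/demo take.m4a"
transcript_path_for "song.final.mp3"
```

Only the last extension is removed, so song.final.mp3 maps to ./song.final.txt. Two inputs that differ only in extension (a.mp3 and a.wav) map to the same .txt file, which is one reason the runner refuses to overwrite an existing transcript.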
From 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that... […]
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text to paste into Suno that only exists inside an m4a file (that is, music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for the NVIDIA GTX 1060 – a card that newer PyTorch builds no longer seem to handle well, which is why the installer pins older torch and numpy versions.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure the old packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u to uninstall torch, torchvision, torchaudio and numpy, then run it again without switches to reinstall without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs only the audio file as an argument. The model (default: small) and the language (default: auto-detect) are optional second and third arguments, for example: whisper recording.m4a medium sv

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text to paste into Suno that only exists inside an m4a file (that is, music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for the NVIDIA GTX 1060 – a card that newer PyTorch builds no longer seem to handle well, which is why the installer pins older torch and numpy versions.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure the old packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u to uninstall torch, torchvision, torchaudio and numpy, then run it again without switches to reinstall without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs only the audio file as an argument. The model (default: small) and the language (default: auto-detect) are optional second and third arguments, for example: whisper recording.m4a medium sv

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
From 2025-12-23 11:26:07 (discovered: 2026-02-05 14:24:03) hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
To 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text to paste into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure everything is removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, this switch lets you reinstall a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script runs without any switches: it only needs the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text to paste into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
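
The path conversion is the step that makes the batch file work. As a rough illustration of what wslpath does for ordinary drive-letter paths, here is a minimal shell sketch. The function name win_to_wsl_path is hypothetical, and it assumes the default WSL setup where Windows drives are mounted under /mnt/<letter>. The real wslpath covers more cases than this sketch does.

```shell
# Hypothetical helper: approximate `wslpath` for plain drive-letter paths.
# Assumes Windows drives are mounted under /mnt/<letter> (the WSL default).
win_to_wsl_path() {
    local win_path drive rest
    win_path=$1

    # Accept only paths like C:\dir\file or C:/dir/file.
    case $win_path in
        [A-Za-z]:[\\/]*) ;;
        *) echo "Unsupported path: $win_path" >&2; return 1 ;;
    esac

    # Drive letter, lowercased: "C" -> "c".
    drive=$(printf '%s' "${win_path%%:*}" | tr 'A-Z' 'a-z')
    # Everything after "C:\", with backslashes turned into slashes.
    rest=$(printf '%s' "${win_path#?:?}" | tr '\\' '/')

    printf '/mnt/%s/%s\n' "$drive" "$rest"
}
```

For example, win_to_wsl_path 'F:\music\my song.m4a' prints /mnt/f/music/my song.m4a, which is the kind of value the batch file stores in WSL_FILE.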

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
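
The command line in the .reg file points at a fixed location on the F: drive, so it has to be adapted for any other machine. As a minimal sketch (not part of the original setup), the hypothetical helper below prints an equivalent .reg file for an arbitrary batch file path, doubling the backslashes the way the entry above does. The function name, the default English label and the example path are assumptions for illustration.

```shell
# Hypothetical helper: print a .reg file like the one above for any
# Windows path to whisper.bat. The label text is an illustrative default.
make_whisper_reg() {
    local bat_path label escaped
    bat_path=$1
    label=${2:-"Transcribe with Whisper (WSL)"}

    # .reg string values need every backslash doubled.
    escaped=$(printf '%s' "$bat_path" | sed 's/\\/\\\\/g')

    cat <<EOF
Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="$label"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"$escaped\" \"%1\""
EOF
}
```

Redirect the output to a file, for example make_whisper_reg 'C:\Tools\Whisper.bat' > whisper.reg, and import that file on the Windows side.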

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure everything is removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, this switch lets you reinstall a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
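
The installer's mode selection is easy to try in isolation. The following minimal sketch wraps the same getopts pattern in a hypothetical function, parse_mode, so the effect of the -u switch can be checked without touching a virtual environment. The function name and the usage text are assumptions, and the rest of the installer is left out.

```shell
# Hypothetical wrapper around the installer's getopts loop.
# Prints "install" by default and "uninstall" when -u is given.
parse_mode() {
    local mode opt
    mode="install"
    OPTIND=1  # restart option parsing on every call

    while getopts ":u" opt "$@"; do
        case "$opt" in
            u) mode="uninstall" ;;
            *)
                echo "Usage: installer [-u]" >&2
                return 1
                ;;
        esac
    done

    printf '%s\n' "$mode"
}
```

Called as parse_mode it prints install, called as parse_mode -u it prints uninstall, and any other switch prints the usage line and returns a non-zero status.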

The script itself

The script runs without any switches: it only needs the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
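
The output location follows directly from the input path: same directory, same file name, with the last extension swapped for .txt. A minimal sketch of that derivation, using a hypothetical function name (transcript_path) and made-up example paths:

```shell
# Hypothetical helper mirroring the BASENAME/STEM/OUTDIR logic of the runner.
transcript_path() {
    local input base stem dir
    input=$1
    base=$(basename -- "$input")   # file name without the directory part
    stem=${base%.*}                # strip only the last extension
    dir=$(dirname -- "$input")     # "." for a bare file name
    printf '%s/%s.txt\n' "$dir" "$stem"
}
```

For example, transcript_path '/mnt/f/music/my song.v2.m4a' prints /mnt/f/music/my song.v2.txt. An input that already ends in .txt maps onto itself, so the overwrite check in the runner aborts before Whisper is ever called.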

Versions

  1. 2025-12-23 11:26:07
    Discovered: 2026-04-24 08:16:26 Hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text to paste into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    Table of Contents
    whisper.bat
    whisper.reg (explorer right clicks)
    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
    The script itself

    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (explorer right clicks)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transkribera med Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

    To make sure everything is removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, this switch lets you reinstall a second time without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
    case "$opt" in
    u) MODE="uninstall" ;;
    *)
    echo "Usage: $0 [-u]"
    exit 1
    ;;
    esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."

    The script itself

    The script runs without any switches: it only needs the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    # whisper <input.extension> [model] [language]
    #
    # Output:
    # <input-filename>.txt (same directory)
    #
    # Behaviour:
    # - Refuses to overwrite existing .txt
    # - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"
  2. 2025-12-23 11:26:07
    Discovered: 2026-04-24 08:14:23 Hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text to paste into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    Table of Contents
    whisper.bat
    whisper.reg (explorer right clicks)
    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
    The script itself

    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (explorer right clicks)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transkribera med Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

    To make sure everything is removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, this switch lets you reinstall a second time without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
    case "$opt" in
    u) MODE="uninstall" ;;
    *)
    echo "Usage: $0 [-u]"
    exit 1
    ;;
    esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."

    The script itself

    The script runs without any switches: it only needs the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    # whisper <input.extension> [model] [language]
    #
    # Output:
    # <input-filename>.txt (same directory)
    #
    # Behaviour:
    # - Refuses to overwrite existing .txt
    # - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"
  3. 2025-12-23 11:26:07
    Discovered: 2026-03-19 13:50:20 Hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example: I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    Table of Contents
    whisper.bat
    whisper.reg (explorer right clicks)
    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
    The script itself
    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (explorer right clicks)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transkribera med Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

    To make sure the conflicting packages (torch, torchvision, torchaudio and numpy) are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, this switch lets you reinstall a second time without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
    case "$opt" in
    u) MODE="uninstall" ;;
    *)
    echo "Usage: $0 [-u]"
    exit 1
    ;;
    esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."

    The script itself

    The script runs without any switches; the only required argument is the audio file to transcribe. As the usage line shows, it also accepts an optional model and language.

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    # whisper <input.extension> [model] [language]
    #
    # Output:
    # <input-filename>.txt (same directory)
    #
    # Behaviour:
    # - Refuses to overwrite existing .txt
    # - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"
  4. 2025-12-23 11:26:07
    Discovered: 2026-02-05 14:24:03 Hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example: I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    Table of Contents
    whisper.bat
    whisper.reg (explorer right clicks)
    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
    The script itself
    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (explorer right clicks)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transkribera med Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

    To make sure the conflicting packages (torch, torchvision, torchaudio and numpy) are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, this switch lets you reinstall a second time without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
    case "$opt" in
    u) MODE="uninstall" ;;
    *)
    echo "Usage: $0 [-u]"
    exit 1
    ;;
    esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."

    The script itself

    The script runs without any switches; the only required argument is the audio file to transcribe. As the usage line shows, it also accepts an optional model and language.

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    # whisper <input.extension> [model] [language]
    #
    # Output:
    # <input-filename>.txt (same directory)
    #
    # Behaviour:
    # - Refuses to overwrite existing .txt
    # - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"

The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Permalink
Published: 2025-12-23 11:26:07
Discovered: 2026-04-24 08:14:23
Author: 1
Hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
https://www.tornevalls.se/the-struggle-transcribe-stuff-for-free-with-whisper-and-wsl-linux-with-a-gtx-1060/
Description

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

Content

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example: I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
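
The batch file relies on wslpath to turn the Windows path from Explorer into something bash can open. As a rough illustration of what that conversion does for a plain drive-letter path, here is a small sketch. The function name win_to_wsl is made up for this example, it assumes drives are mounted under /mnt (the WSL default), and it does not handle UNC or relative paths, so the real wslpath is still the right tool in practice.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original scripts: mimics what
# wslpath does for simple drive-letter paths such as F:\music\song.m4a.
# Assumption: drives are mounted under /mnt (the WSL default).
win_to_wsl() {
    local drive rest
    # First character is the drive letter; WSL mounts it in lower case.
    drive="$(printf '%s' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]')"
    # Drop "X:" and turn every backslash into a forward slash.
    rest="$(printf '%s' "$1" | cut -c3- | tr '\\' '/')"
    printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Spaces survive because the value stays inside double quotes throughout.
win_to_wsl 'F:\viktigt\Private\my song.m4a'
```

The quoting is the part worth copying: the path is never left unquoted, which is the same reason the batch file wraps %WSL_FILE% in escaped quotes before handing it to bash.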

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure the conflicting packages (torch, torchvision, torchaudio and numpy) are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, this switch lets you reinstall a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
    case "$opt" in
        u) MODE="uninstall" ;;
        *)
            echo "Usage: $0 [-u]"
            exit 1
            ;;
    esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
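
The -u switch in the installer is handled with the shell's built-in getopts. The sketch below isolates that parsing so it can be tried without touching any packages. The function name parse_mode is invented for this example, and the logic mirrors the installer's loop rather than replacing it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the installer's option loop, for illustration only.
parse_mode() {
    local mode="install" opt
    OPTIND=1    # reset getopts state so the function can be called repeatedly
    while getopts ":u" opt; do
        case "$opt" in
            u) mode="uninstall" ;;
            *)
                # Any unknown switch ends up here, as in the installer.
                echo "Usage: installer [-u]" >&2
                return 1
                ;;
        esac
    done
    printf '%s\n' "$mode"
}

parse_mode       # no switch: default mode
parse_mode -u    # -u selects the uninstall branch
```

Assuming the installer was saved as whisper-install.sh (a file name picked here purely for illustration), a clean reinstall would be ./whisper-install.sh -u followed by ./whisper-install.sh. Note that the installer's sanity check expects the venv directory to exist already in both modes.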

The script itself

The script runs without any switches; the only required argument is the audio file to transcribe. As the usage line shows, it also accepts an optional model and language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
#   whisper <input.extension> [model] [language]
#
# Output:
#   <input-filename>.txt (same directory)
#
# Behaviour:
#   - Refuses to overwrite existing .txt
#   - Stops execution if output exists

if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
        echo "Error: No .txt output produced."
        exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
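
The runner derives the transcript name from the input path with basename, dirname and the ${VAR%.*} expansion, which strips only the last extension. A small sketch of just that step, using a made-up function name transcript_path, shows how file names with spaces or several dots come out:

```shell
#!/usr/bin/env bash
# Hypothetical helper that repeats the runner's naming logic in isolation.
transcript_path() {
    local input="$1"
    local base stem outdir
    base="$(basename "$input")"     # file name without the directory part
    stem="${base%.*}"               # remove only the last ".extension"
    outdir="$(dirname "$input")"    # "." when the input has no directory part
    printf '%s/%s.txt\n' "$outdir" "$stem"
}

transcript_path "/mnt/f/music/my song.v2.m4a"
transcript_path "recording.wav"
```

One consequence of this naming scheme: song.m4a and song.mp3 in the same folder both map to song.txt, so the second run is stopped by the overwrite check until the first transcript is moved or renamed.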


History — 4 versions shown

Changes

From 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:16:26) hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]
Content
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app. Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized. At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going. I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself. The end result was the following (thanks to ChatGPT): A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well. A Whisper runner for WSL/Linux: run whisper and get a .txt transcript generated from the audio file. A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click. 
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names. The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in. WSL uses python and pip… Table of Contents Toggle whisper.batwhisper.reg (explorer right clicks)installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself whisper.bat @echo off setlocal EnableExtensions REM Force UTF-8 codepage (fixes å ä ö) chcp 65001 >nul REM File passed from Explorer set "WIN_FILE=%~1" REM Convert Windows path to WSL path (UTF-8 safe now) for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i" REM Run whisper on that file wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\"" endlocal whisper.reg (explorer right clicks) Windows Registry Editor Version 5.00 [HKEY_CLASSES_ROOT\*\shell\WhisperWSL] @="Transkribera med Whisper (WSL)" "Icon"="wsl.exe" [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command] @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\"" installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller) To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts. #!/usr/bin/env bash set -euo pipefail VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}" MODE="install" # --- Parse args --- while getopts ":u" opt; do case "$opt" in u) MODE="uninstall" ;; *) echo "Usage: $0 [-u]" exit 1 ;; esac done echo "==> Whisper installer (GTX 1060 compatible)" echo "==> Mode: $MODE" # --- Sanity --- if [[ ! 
-d "$VENV_DIR" ]]; then echo "Error: venv not found: $VENV_DIR" exit 1 fi # shellcheck disable=SC1090 source "$VENV_DIR/bin/activate" python -m pip install --upgrade pip setuptools wheel # ================================================== # UNINSTALL MODE (-u) # ================================================== if [[ "$MODE" == "uninstall" ]]; then echo "==> Uninstalling incompatible packages ONLY (-u)" pip uninstall -y torch torchvision torchaudio || true pip uninstall -y numpy || true echo "" echo "Done." echo "Uninstall completed. Nothing else touched." exit 0 fi # ================================================== # INSTALL MODE (DEFAULT) # ================================================== echo "==> Installing compatible stack (no forced uninstall)" pip install \ numpy==1.26.4 \ torch==1.13.1+cu116 \ torchvision==0.14.1+cu116 \ torchaudio==0.13.1 \ --extra-index-url https://download.pytorch.org/whl/cu116 # --- Verify --- echo "==> Verifying environment" python - /dev/null 2>&1; then echo "Error: whisper not found in PATH or venv." exit 1 fi TMPDIR="$(mktemp -d)" cleanup() { rm -rf "$TMPDIR"; } trap cleanup EXIT echo "==> Transcribing:" echo " input: $INPUT" echo " output: $OUTPUT" echo " model: $MODEL" echo " lang: ${LANGUAGE:-auto}" ARGS=( "$INPUT" --model "$MODEL" --output_dir "$TMPDIR" --output_format txt --task transcribe --verbose False --fp16 False ) if [[ -n "$LANGUAGE" ]]; then ARGS+=( --language "$LANGUAGE" ) fi "$WHISPER_BIN" "${ARGS[@]}" GENERATED_TXT="$TMPDIR/$STEM.txt" if [[ ! -f "$GENERATED_TXT" ]]; then FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)" if [[ -z "${FOUND_TXT:-}" ]]; then echo "Error: No .txt output produced." exit 1 fi GENERATED_TXT="$FOUND_TXT" fi # --- Final
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example: I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
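
To illustrate what the wslpath call in the batch file does, here is a minimal shell sketch of the same conversion for plain drive-letter paths. The function name win_to_wsl_path is made up for this example, and it assumes WSL's default /mnt mount point for Windows drives. The real wslpath covers more cases, so treat this as an approximation and not a replacement.

```shell
#!/usr/bin/env bash
# Approximate what `wslpath` does for simple drive-letter paths.
# Assumption: Windows drives are mounted under /mnt (the WSL default).
# win_to_wsl_path is a hypothetical helper, not part of the original scripts.
win_to_wsl_path() {
  local win="$1" drive rest
  case "$win" in
    [A-Za-z]:[\\/]*) ;;   # e.g. F:\dir\file.m4a or F:/dir/file.m4a
    *)
      echo "Unsupported path: $win" >&2
      return 1
      ;;
  esac
  # Drive letter, lowercased: "F" -> "f"
  drive="$(printf '%s' "$win" | cut -c1 | tr '[:upper:]' '[:lower:]')"
  # Everything after "X:\" with backslashes turned into forward slashes
  rest="$(printf '%s' "$win" | cut -c4- | tr '\\' '/')"
  printf '/mnt/%s/%s\n' "$drive" "$rest"
}

win_to_wsl_path 'F:\viktigt\Private\song.m4a'
```

Spaces and non-ASCII characters pass through unchanged here, which is why the batch file only needs correct quoting and the UTF-8 codepage to hand such names over safely.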

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

The script has a -u switch that removes the incompatible packages (torch, torchvision, torchaudio and numpy) before a reinstall. If the first attempt goes wrong, run the script with -u and then install again to avoid conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
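
The option handling in the installer can be tried in isolation. The sketch below wraps the same getopts loop in a function; parse_mode is a hypothetical name used only for this example, and nothing is installed or removed when it runs.

```shell
#!/usr/bin/env bash
# parse_mode is a hypothetical wrapper around the installer's getopts loop,
# so the -u handling can be tested without touching any packages.
parse_mode() {
  local mode="install" opt
  OPTIND=1   # reset getopts state so the function can be called repeatedly
  while getopts ":u" opt; do
    case "$opt" in
      u) mode="uninstall" ;;
      *)
        echo "Usage: installer [-u]" >&2
        return 1
        ;;
    esac
  done
  printf '%s\n' "$mode"
}

parse_mode        # prints: install
parse_mode -u     # prints: uninstall
```

The leading colon in ":u" puts getopts in silent mode, so an unknown option falls through to the `*` branch and prints the usage line.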

The script itself

The script needs no switches, only the audio file to transcribe (though, as the usage comment shows, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
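
The way the runner derives the transcript path can be sketched on its own. The helper name transcript_path is an assumption made for this example; the logic mirrors the BASENAME, STEM and OUTDIR steps in the script above.

```shell
#!/usr/bin/env bash
# transcript_path is a hypothetical helper that mirrors the runner's
# BASENAME/STEM/OUTDIR steps: same directory, same name, .txt extension.
transcript_path() {
  local input="$1" base stem dir
  base="$(basename "$input")"   # file name without the directory part
  stem="${base%.*}"             # strip only the last extension
  dir="$(dirname "$input")"     # directory part ("." if none was given)
  printf '%s\n' "$dir/$stem.txt"
}

transcript_path "/mnt/f/music/my song.m4a"   # /mnt/f/music/my song.txt
transcript_path "demo.v2.wav"                # ./demo.v2.txt
```

Because only the last extension is stripped, song.m4a and song.wav in the same folder map to the same song.txt. That is the case the overwrite check in the runner guards against.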
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

The script has a -u switch that removes the incompatible packages (torch, torchvision, torchaudio and numpy) before a reinstall. If the first attempt goes wrong, run the script with -u and then install again to avoid conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to transcribe (though, as the usage comment shows, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
From 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that... […]
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

The script has a -u switch that removes the incompatible packages (torch, torchvision, torchaudio and numpy) before a reinstall. If the first attempt goes wrong, run the script with -u and then install again to avoid conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to transcribe (though, as the usage comment shows, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

The script has a -u switch that removes the incompatible packages (torch, torchvision, torchaudio and numpy) before a reinstall. If the first attempt goes wrong, run the script with -u and then install again to avoid conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
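
The -u handling can be tried in isolation. The sketch below wraps the same getopts loop in a function; the name parse_mode is hypothetical and only exists to make the behaviour easy to poke at, it is not part of the installer.

```shell
#!/usr/bin/env bash
# Minimal sketch of the installer's switch parsing, isolated in a function.
# parse_mode is an illustrative name, not something the installer defines.
parse_mode() {
  mode="install"
  OPTIND=1                      # reset so the function can be called repeatedly
  while getopts ":u" opt; do
    case "$opt" in
      u) mode="uninstall" ;;
      *) echo "Usage: installer [-u]" >&2
         return 1 ;;
    esac
  done
  echo "$mode"
}

parse_mode        # prints install
parse_mode -u     # prints uninstall
```

With the real installer the intended order would be one run with -u to clear out torch, torchvision, torchaudio and numpy, followed by a plain run to install the pinned versions.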

The script itself

The script needs no switches, only the audio file to be transcribed (but as you can see, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
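
How the runner picks its output path can be sketched on its own. The function name derive_output is made up for this illustration; the logic mirrors the BASENAME, STEM and OUTDIR lines in the script.

```shell
#!/usr/bin/env bash
# Sketch of the output-path rule: same directory, same name, extension -> .txt.
# derive_output is an illustrative helper, not part of the runner script.
derive_output() {
  input="$1"
  base=$(basename "$input")     # "my song.m4a"
  stem="${base%.*}"             # strip only the last extension -> "my song"
  printf '%s/%s.txt\n' "$(dirname "$input")" "$stem"
}

derive_output "/mnt/f/music/my song.m4a"   # prints /mnt/f/music/my song.txt
derive_output "demo.v2.mp3"                # prints ./demo.v2.txt
```

One side effect of this rule: if the input itself ends in .txt, the derived output is the input file, so the overwrite check aborts before anything is transcribed.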
From 2025-12-23 11:26:07 (discovered: 2026-02-05 14:24:03) hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
To 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure the conflicting packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u and then install again without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to be transcribed (but as you can see, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure the conflicting packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u and then install again without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to be transcribed (but as you can see, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"

Versions

  1. 2025-12-23 11:26:07
    Discovered: 2026-04-24 08:16:26 Hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    Table of Contents
    whisper.bat
    whisper.reg (explorer right clicks)
    installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
    The script itself
    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (explorer right clicks)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transkribera med Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

    To make sure the conflicting packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u and then install again without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
    case "$opt" in
    u) MODE="uninstall" ;;
    *)
    echo "Usage: $0 [-u]"
    exit 1
    ;;
    esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."

    The script itself

    The script needs no switches, only the audio file to be transcribed (but as you can see, it also accepts an optional model and language).

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    # whisper <input.extension> [model] [language]
    #
    # Output:
    # <input-filename>.txt (same directory)
    #
    # Behaviour:
    # - Refuses to overwrite existing .txt
    # - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"
  2. 2025-12-23 11:26:07
    Discovered: 2026-04-24 08:14:23 Hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed so it can be pasted into Suno, but it exists only as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (explorer right clicks)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transkribera med Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

    To make sure the conflicting packages are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, run it with -u and then install again, so the second attempt starts without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
    case "$opt" in
    u) MODE="uninstall" ;;
    *)
    echo "Usage: $0 [-u]"
    exit 1
    ;;
    esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."

    The script itself

    The script needs no switches, only the audio file to be transcribed (although, as the usage comment shows, it also accepts an optional model and language).

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    # whisper <input.extension> [model] [language]
    #
    # Output:
    # <input-filename>.txt (same directory)
    #
    # Behaviour:
    # - Refuses to overwrite existing .txt
    # - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"
  3. 2025-12-23 11:26:07
    Discovered: 2026-03-19 13:50:20 Hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
  4. 2025-12-23 11:26:07
    Discovered: 2026-02-05 14:24:03 Hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Permalink
Published: 2025-12-23 11:26:07
Discovered: 2026-03-19 13:50:20
Author: 1
Hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
https://www.tornevalls.se/the-struggle-transcribe-stuff-for-free-with-whisper-and-wsl-linux-with-a-gtx-1060/
Description

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

Content

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed so it can be pasted into Suno, but it exists only as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
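The path conversion is the step in the batch file that is easiest to get wrong, so here is a rough illustration of what `wslpath` returns for an ordinary drive path. This is only a sketch: `win_to_wsl_path` is a hypothetical helper, it assumes drives are mounted under the default `/mnt/<letter>` root, and it ignores UNC paths and custom mount roots, which the real `wslpath` takes care of.

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximates `wslpath` for plain drive paths only.
# Assumes the default WSL automount root /mnt (an assumption, not from the post).
win_to_wsl_path() {
    local win="$1"
    # Expect something like F:\dir\file.m4a
    if [[ "$win" != [A-Za-z]:\\* ]]; then
        echo "unsupported path: $win" >&2
        return 1
    fi
    local drive="${win:0:1}"   # drive letter
    local rest="${win:2}"      # everything after "F:"
    rest="${rest//\\//}"       # backslashes -> forward slashes
    printf '/mnt/%s%s\n' "${drive,,}" "$rest"
}

win_to_wsl_path 'F:\viktigt\Private\My Song.m4a'
```

In the batch file itself the real `wslpath` should stay in place; the sketch is only meant to show why spaces survive (the path is passed as one quoted argument) and what the resulting Linux path looks like.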

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
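The path in the `command` key is hardcoded to one machine's layout, so it has to be changed before the file is imported. As a sketch of how the escaping works (`make_whisper_reg` is a hypothetical helper, not part of the original setup), the following prints the same registry file for any location of the batch file. Every backslash in the path is doubled and the inner quotes are backslash-escaped, which is what .reg string values expect.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a .reg file like the one above for a given
# Windows path to whisper.bat. The menu label is an optional second argument.
make_whisper_reg() {
    local bat_path="$1"
    local label="${2:-Transkribera med Whisper (WSL)}"
    # .reg string values need every backslash doubled.
    local escaped="${bat_path//\\/\\\\}"

    printf '%s\n' 'Windows Registry Editor Version 5.00' ''
    printf '%s\n' '[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]'
    printf '@="%s"\n' "$label"
    printf '%s\n' '"Icon"="wsl.exe"' ''
    printf '%s\n' '[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]'
    # Produces: @="\"<escaped path>\" \"%1\""
    printf '@="\\"%s\\" \\"%%1\\""\n' "$escaped"
}

make_whisper_reg 'C:\Tools\whisper.bat'
```

The output can be redirected to a file and reviewed before it is imported on the Windows side.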

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure the conflicting packages are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, run it with -u and then install again, so the second attempt starts without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
    case "$opt" in
        u) MODE="uninstall" ;;
        *)
            echo "Usage: $0 [-u]"
            exit 1
            ;;
    esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
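Note that the installer stops with an error if the virtual environment does not exist, and the script as shown does not install Whisper itself. A minimal sketch of that preparation step follows, under several assumptions that are not in the original post: `python3` with the `venv` module is available, Whisper comes from the PyPI package `openai-whisper`, and `DRY_RUN`, `run` and `prepare_venv` are hypothetical names. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: prepares the venv that the installer above expects.
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [[ "$DRY_RUN" == "1" ]]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

prepare_venv() {
    # Create the venv only if it is missing.
    if [[ ! -d "$VENV_DIR" ]]; then
        run python3 -m venv "$VENV_DIR"
    fi
    # Assumption: the PyPI package "openai-whisper" provides the whisper CLI.
    run "$VENV_DIR/bin/python" -m pip install openai-whisper
}

prepare_venv
```

One possible order is to run this with `DRY_RUN=0`, then the installer with -u, then the installer again without switches, so that whatever torch build pip pulled in is replaced by the pinned one. Whisper also needs `ffmpeg` on the PATH to decode audio files.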

The script itself

The script needs no switches, only the audio file to be transcribed (although, as the usage comment shows, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
  echo "Usage: whisper <input.extension> [model] [language]"
  exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
  echo "Error: Input file not found: $INPUT"
  exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
  echo "Error: Output file already exists:"
  echo " $OUTPUT"
  echo "Aborting to avoid overwrite."
  exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
  WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
  echo "Error: whisper not found in PATH or venv."
  exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
  "$INPUT"
  --model "$MODEL"
  --output_dir "$TMPDIR"
  --output_format txt
  --task transcribe
  --verbose False
  --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
  ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
  FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
  if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
  fi
  GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
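Because the runner refuses to overwrite an existing .txt, it is useful to know which files in a folder still lack a transcript before starting a batch. The sketch below lists them using the same "strip the last extension, add .txt" rule as the script. The function name pending_inputs and the choice of .m4a, .mp3 and .wav are assumptions for illustration, not part of the original scripts.

```shell
#!/usr/bin/env bash
set -euo pipefail

# List audio files in a directory that have no .txt transcript next to them.
# Assumption: only .m4a, .mp3 and .wav are of interest; adjust the globs as needed.
pending_inputs() {
  local dir="$1" f
  for f in "$dir"/*.m4a "$dir"/*.mp3 "$dir"/*.wav; do
    # A glob without matches stays literal, so skip anything that is not a file.
    [[ -f "$f" ]] || continue
    # Same rule as the runner: <name>.<ext> becomes <name>.txt in the same directory.
    # If that file already exists, the runner would abort, so leave it out.
    [[ -f "${f%.*}.txt" ]] && continue
    printf '%s\n' "$f"
  done
}

# Example batch run, using the install path that whisper.bat calls.
# stdin is redirected so the transcription process cannot swallow the file list.
# pending_inputs "$HOME/recordings" | while IFS= read -r f; do
#   /usr/local/tornevall/whisper "$f" small </dev/null
# done
```

Files are listed one per line, so names with spaces survive the while/read loop; names containing newlines would need a different approach.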


History — 4 versions shown

Changes

From 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:16:26) hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle […]
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

To make sure old packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u to uninstall torch, torchvision, torchaudio and numpy, then run it again to reinstall without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches: the only required argument is the audio file to transcribe. As the usage comment shows, a model name and a language can optionally follow.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

To make sure old packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u to uninstall torch, torchvision, torchaudio and numpy, then run it again to reinstall without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches: the only required argument is the audio file to transcribe. As the usage comment shows, a model name and a language can optionally follow.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
From 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as an m4a file (i.e. music, that sometimes has hardcoded subtitles that... […]
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as an m4a file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

To make sure old packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u to uninstall torch, torchvision, torchaudio and numpy, then run it again to reinstall without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to transcribe (though, as the usage comment shows, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so I can paste it into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
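
For illustration, the path conversion that whisper.bat delegates to wslpath can be approximated in plain shell. This is only a sketch under assumptions: the function name win_to_wsl_path is hypothetical, it assumes the default /mnt/<drive> automount root, and it only handles drive-letter paths, whereas the real wslpath covers more cases.

```bash
# Hypothetical helper, not part of the original scripts.
# Approximates what `wslpath` does for a plain drive-letter path.
win_to_wsl_path() {
  local drive rest
  # Reject anything that is not of the form X:\...
  case "$1" in
    [A-Za-z]:\\*) ;;
    *) echo "not a drive-letter path: $1" >&2; return 1 ;;
  esac
  # Drive letter, lowercased (F -> f)
  drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
  # Everything after "X:", with backslashes turned into forward slashes
  rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
  printf '/mnt/%s%s\n' "$drive" "$rest"
}
```

Called as win_to_wsl_path 'F:\viktigt\Private\song one.m4a', it prints /mnt/f/viktigt/Private/song one.m4a. The space survives because every expansion is quoted, which is the same reason the batch file wraps %WIN_FILE% and %WSL_FILE% in quotes.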

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

The script has a -u switch that uninstalls the conflicting packages (torch, torchvision, torchaudio and numpy) before a reinstall. If the first install goes wrong, run it with -u and then install again to avoid conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
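
The installer above stops with "venv not found" when the virtual environment does not exist yet, so the venv has to be created first. A minimal sketch of that step, assuming python3 with the venv module is available inside WSL/Linux; the helper name ensure_venv is hypothetical and not part of the original scripts.

```bash
# Hypothetical helper: create the venv the installer expects, if it is missing.
ensure_venv() {
  local venv_dir
  # Same default location as VENV_DIR in the installer
  venv_dir="${1:-$HOME/.venvs/whisper}"
  if [ -f "$venv_dir/bin/activate" ]; then
    echo "venv already present: $venv_dir"
    return 0
  fi
  echo "creating venv: $venv_dir"
  # Assumption: python3 and its venv module are installed
  python3 -m venv "$venv_dir"
}
```

Running ensure_venv once before the installer should be enough. The installer as shown only pins numpy and the torch stack, so Whisper itself still has to be installed into the same venv, for example with pip install openai-whisper (the PyPI package name; whether it resolves cleanly against these pins is not something the post confirms).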

The script itself

The script needs no switches, only the audio file to transcribe (though, as the usage comment shows, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
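
The way the runner derives its output file name can be isolated into a few lines. The sketch below mirrors the BASENAME, STEM and OUTDIR steps from the script; the function name transcript_path is hypothetical and only there for illustration.

```bash
# Hypothetical helper mirroring the runner's BASENAME/STEM/OUTDIR logic.
transcript_path() {
  local base stem dir
  base="$(basename -- "$1")"   # file name without the directory
  stem="${base%.*}"            # strip the last extension only
  dir="$(dirname -- "$1")"     # directory of the input file
  printf '%s/%s.txt\n' "$dir" "$stem"
}
```

For /mnt/f/music/my song.m4a this gives /mnt/f/music/my song.txt, and for a bare archive.tar.gz it gives ./archive.tar.txt, since only the last extension is removed. Two inputs that differ only in extension therefore map to the same transcript, which is one more reason the runner refuses to overwrite an existing .txt.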
From 2025-12-23 11:26:07 (discovered: 2026-02-05 14:24:03) hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
To 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
Content
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so I can paste it into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app. Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized. At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going. I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself. The end result was the following (thanks to ChatGPT): A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well. A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file. A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click. 
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names. The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in. WSL uses python and pip… Table of Contents: whisper.bat; whisper.reg (explorer right clicks); Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller); The script itself. whisper.bat @echo off setlocal EnableExtensions REM Force UTF-8 codepage (fixes å ä ö) chcp 65001 >nul REM File passed from Explorer set "WIN_FILE=%~1" REM Convert Windows path to WSL path (UTF-8 safe now) for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i" REM Run whisper on that file wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\"" endlocal whisper.reg (explorer right clicks) Windows Registry Editor Version 5.00 [HKEY_CLASSES_ROOT\*\shell\WhisperWSL] @="Transkribera med Whisper (WSL)" "Icon"="wsl.exe" [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command] @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\"" Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller) The script has a -u switch that uninstalls the conflicting packages (torch, torchvision, torchaudio and numpy) before a reinstall. If the first install goes wrong, run it with -u and then install again to avoid conflicts. #!/usr/bin/env bash set -euo pipefail VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}" MODE="install" # --- Parse args --- while getopts ":u" opt; do case "$opt" in u) MODE="uninstall" ;; *) echo "Usage: $0 [-u]" exit 1 ;; esac done echo "==> Whisper installer (GTX 1060 compatible)" echo "==> Mode: $MODE" # --- Sanity --- if [[ ! 
-d "$VENV_DIR" ]]; then echo "Error: venv not found: $VENV_DIR" exit 1 fi # shellcheck disable=SC1090 source "$VENV_DIR/bin/activate" python -m pip install --upgrade pip setuptools wheel # ================================================== # UNINSTALL MODE (-u) # ================================================== if [[ "$MODE" == "uninstall" ]]; then echo "==> Uninstalling incompatible packages ONLY (-u)" pip uninstall -y torch torchvision torchaudio || true pip uninstall -y numpy || true echo "" echo "Done." echo "Uninstall completed. Nothing else touched." exit 0 fi # ================================================== # INSTALL MODE (DEFAULT) # ================================================== echo "==> Installing compatible stack (no forced uninstall)" pip install \ numpy==1.26.4 \ torch==1.13.1+cu116 \ torchvision==0.14.1+cu116 \ torchaudio==0.13.1 \ --extra-index-url https://download.pytorch.org/whl/cu116 # --- Verify --- echo "==> Verifying environment" python - /dev/null 2>&1; then echo "Error: whisper not found in PATH or venv." exit 1 fi TMPDIR="$(mktemp -d)" cleanup() { rm -rf "$TMPDIR"; } trap cleanup EXIT echo "==> Transcribing:" echo " input: $INPUT" echo " output: $OUTPUT" echo " model: $MODEL" echo " lang: ${LANGUAGE:-auto}" ARGS=( "$INPUT" --model "$MODEL" --output_dir "$TMPDIR" --output_format txt --task transcribe --verbose False --fp16 False ) if [[ -n "$LANGUAGE" ]]; then ARGS+=( --language "$LANGUAGE" ) fi "$WHISPER_BIN" "${ARGS[@]}" GENERATED_TXT="$TMPDIR/$STEM.txt" if [[ ! -f "$GENERATED_TXT" ]]; then FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)" if [[ -z "${FOUND_TXT:-}" ]]; then echo "Error: No .txt output produced." exit 1 fi GENERATED_TXT="$FOUND_TXT" fi # --- Final
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so I can paste it into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

The script has a -u switch that uninstalls the conflicting packages (torch, torchvision, torchaudio and numpy) before a reinstall. If the first install goes wrong, run it with -u and then install again to avoid conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to transcribe (though, as the usage comment shows, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so I can paste it into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

The script has a -u switch that uninstalls the conflicting packages (torch, torchvision, torchaudio and numpy) before a reinstall. If the first install goes wrong, run it with -u and then install again to avoid conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to transcribe (though, as the usage comment shows, it also accepts an optional model and language).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
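The runner writes the transcript next to the input, swapping only the last extension for .txt, and it refuses to run when that file already exists. The sketch below mirrors those two rules so you can see in advance which files in a folder would still be transcribed. The function names `transcript_path` and `pending_inputs` and the list of extensions are assumptions for illustration; they are not part of the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers mirroring the runner's naming and overwrite rules.

# Print the transcript path the runner would use for a given input file.
transcript_path() {
  local input="$1" base stem
  base="$(basename "$input")"
  stem="${base%.*}"                        # strip only the last extension
  printf '%s/%s.txt\n' "$(dirname "$input")" "$stem"
}

# List audio files in a directory that have no transcript yet.
# The extensions checked here are an assumption, not a Whisper limitation.
pending_inputs() {
  local dir="$1" f
  for f in "$dir"/*.m4a "$dir"/*.mp3 "$dir"/*.wav; do
    [[ -f "$f" ]] || continue              # skip glob patterns that matched nothing
    [[ -f "$(transcript_path "$f")" ]] || printf '%s\n' "$f"
  done
}

transcript_path "/music/my song.v2.m4a"    # prints /music/my song.v2.txt
```

Piping `pending_inputs <dir>` into a `while IFS= read -r f` loop that calls the runner on each line would then transcribe only what is missing, instead of stopping at the first existing .txt.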

Versions

  1. 2025-12-23 11:26:07
    Discovered: 2026-04-24 08:16:26 Hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle […]
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (explorer right clicks)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transkribera med Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

    To make sure the old packages are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, run it with -u and then install again without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
    case "$opt" in
    u) MODE="uninstall" ;;
    *)
    echo "Usage: $0 [-u]"
    exit 1
    ;;
    esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."

    The script itself

    The script needs no switches, only the audio file to be transcribed. As the usage comment shows, it also accepts an optional model and language.

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    # whisper <input.extension> [model] [language]
    #
    # Output:
    # <input-filename>.txt (same directory)
    #
    # Behaviour:
    # - Refuses to overwrite existing .txt
    # - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"
  2. 2025-12-23 11:26:07
    Discovered: 2026-04-24 08:14:23 Hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]
  3. 2025-12-23 11:26:07
    Discovered: 2026-03-19 13:50:20 Hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that...
  4. 2025-12-23 11:26:07
    Discovered: 2026-02-05 14:24:03 Hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that...
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (explorer right clicks)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transkribera med Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

    To make sure the old packages are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, run it with -u and then install again without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
    case "$opt" in
    u) MODE="uninstall" ;;
    *)
    echo "Usage: $0 [-u]"
    exit 1
    ;;
    esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"

    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true

    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."

    The script itself

    The script needs no switches, only the audio file to be transcribed. As the usage comment shows, it also accepts an optional model and language.

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    # whisper <input.extension> [model] [language]
    #
    # Output:
    # <input-filename>.txt (same directory)
    #
    # Behaviour:
    # - Refuses to overwrite existing .txt
    # - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"

The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

Permalink
Published: 2025-12-23 11:26:07
Discovered: 2026-02-05 14:24:03
Author: 1
Hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
https://www.tornevalls.se/the-struggle-transcribe-stuff-for-free-with-whisper-and-wsl-linux-with-a-gtx-1060/
Description

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed for pasting into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that...

Content

I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text transcribed for pasting into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there had to be software capable of doing this completely locally. I had previously discovered that Whisper also exists in an open-source form (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
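To illustrate what the wslpath step does with a typical drive path, here is a rough shell sketch. It assumes Windows drives are mounted under /mnt (the WSL default) and that the input is a plain drive-letter path. The function name win_to_wsl_path is made up for this example, and the real wslpath covers more cases (custom mount roots, UNC paths) than this does.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original scripts: approximates
# `wslpath` for simple "X:\dir\file" paths only.
# Assumption: Windows drives are mounted under /mnt.
win_to_wsl_path() {
  # Drive letter, lowercased ("F" -> "f")
  drive=$(printf '%s' "$1" | cut -c1 | tr 'A-Z' 'a-z')
  # Everything after "F:", with backslashes turned into forward slashes
  rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
  printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Spaces survive because every expansion is quoted
win_to_wsl_path 'F:\viktigt\Private\my song.m4a'
```

In the batch file itself the conversion is still left to wslpath. The sketch only shows why the result has to be quoted again before it reaches the whisper script.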

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

The script has a -u switch that removes the conflicting packages (torch, torchvision, torchaudio and numpy) before you reinstall. If something goes wrong the first time, run the script with -u and then run it again to reinstall without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
  case "$opt" in
    u) MODE="uninstall" ;;
    *)
      echo "Usage: $0 [-u]"
      exit 1
      ;;
  esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
  echo "Error: venv not found: $VENV_DIR"
  exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
  echo "==> Uninstalling incompatible packages ONLY (-u)"

  pip uninstall -y torch torchvision torchaudio || true
  pip uninstall -y numpy || true

  echo ""
  echo "Done."
  echo "Uninstall completed. Nothing else touched."
  exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
  numpy==1.26.4 \
  torch==1.13.1+cu116 \
  torchvision==0.14.1+cu116 \
  torchaudio==0.13.1 \
  --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
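The installer stops with "venv not found" unless the virtual environment already exists, and it never installs Whisper itself. A preparation step along the following lines would therefore be needed first. This is only a sketch under assumptions: the helper names run and prepare_whisper_venv are invented here, and Whisper is assumed to come from the openai-whisper package on PyPI. Whisper also relies on the ffmpeg command-line tool to decode audio, which has to be installed separately. Whether the current openai-whisper release works with the pinned torch build is not something this sketch verifies.

```shell
#!/bin/sh
# Hypothetical preparation step, not part of the original scripts.

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

prepare_whisper_venv() {
  # Same default location the installer and the runner look at
  venv_dir="${VENV_DIR:-$HOME/.venvs/whisper}"

  # Create the venv only if it is missing
  if [ ! -d "$venv_dir" ]; then
    run python3 -m venv "$venv_dir"
  fi

  # Assumption: Whisper is installed from the openai-whisper package
  run "$venv_dir/bin/python" -m pip install openai-whisper
}

# Show what would be done without touching anything
DRY_RUN=1 prepare_whisper_venv
```

Running prepare_whisper_venv without DRY_RUN=1 performs the steps for real. After that the installer above can pin numpy and torch inside the same environment.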

The script itself

The script needs no switches, only the audio file to be transcribed. As the usage comment shows, it also accepts an optional model (default: small) and an optional language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
  echo "Usage: whisper <input.extension> [model] [language]"
  exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
  echo "Error: Input file not found: $INPUT"
  exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
  echo "Error: Output file already exists:"
  echo " $OUTPUT"
  echo "Aborting to avoid overwrite."
  exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
  WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
  echo "Error: whisper not found in PATH or venv."
  exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
  "$INPUT"
  --model "$MODEL"
  --output_dir "$TMPDIR"
  --output_format txt
  --task transcribe
  --verbose False
  --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
  ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
  FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
  if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
  fi
  GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
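The way the runner derives the transcript path from the input file can be tried in isolation. The sketch below repeats the same basename, suffix-stripping and dirname steps. The function name transcript_path_for is made up for this illustration and does not appear in the script.

```shell
#!/bin/sh
# Hypothetical helper mirroring the BASENAME / STEM / OUTDIR / OUTPUT
# logic of the runner script, for illustration only.
transcript_path_for() {
  input="$1"
  base=$(basename "$input")        # "my song.m4a"
  stem="${base%.*}"                # strip only the last extension
  printf '%s/%s.txt\n' "$(dirname "$input")" "$stem"
}

transcript_path_for "/mnt/f/music/my song.m4a"
transcript_path_for "notes.final.wav"
```

Only the last extension is removed, so notes.final.wav becomes notes.final.txt, and a file without any extension simply gets .txt appended. The transcript always lands next to the input file, which is why the overwrite check comes before any transcription work starts.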


History — 4 versions shown

Changes

From 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:16:26) hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
Toggle
whisper.batwhisper.reg (explorer right clicks)installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)

To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script can run without any switches – and only with the audio file intended to be transcribed (but as you can see, it can do a bit more).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc.

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
Toggle
whisper.batwhisper.reg (explorer right clicks)installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)

To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script can run without any switches – and only with the audio file intended to be transcribed (but as you can see, it can do a bit more).

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
From 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
To 2025-12-23 11:26:07 (discovered: 2026-04-24 08:14:23) hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that... […]
Content
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app. Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized. At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going. I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself. The end result was the following (thanks to ChatGPT): A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well. A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file. A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click. 
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names. The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in. WSL uses python and pip… Table of Contents: whisper.bat; whisper.reg (explorer right clicks); installer for WSL/Linux (with 1060-compatibility and pre-uninstaller); The script itself. whisper.bat @echo off setlocal EnableExtensions REM Force UTF-8 codepage (fixes å ä ö) chcp 65001 >nul REM File passed from Explorer set "WIN_FILE=%~1" REM Convert Windows path to WSL path (UTF-8 safe now) for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i" REM Run whisper on that file wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\"" endlocal whisper.reg (explorer right clicks) Windows Registry Editor Version 5.00 [HKEY_CLASSES_ROOT\*\shell\WhisperWSL] @="Transkribera med Whisper (WSL)" "Icon"="wsl.exe" [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command] @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\"" installer for WSL/Linux (with 1060-compatibility and pre-uninstaller) The script has a -u switch that uninstalls the pinned packages (torch, torchvision, torchaudio and numpy) so they can be reinstalled cleanly. If the first attempt goes wrong, run the script with -u and then run it again without the switch. #!/usr/bin/env bash set -euo pipefail VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}" MODE="install" # --- Parse args --- while getopts ":u" opt; do case "$opt" in u) MODE="uninstall" ;; *) echo "Usage: $0 [-u]" exit 1 ;; esac done echo "==> Whisper installer (GTX 1060 compatible)" echo "==> Mode: $MODE" # --- Sanity --- if [[ ! 
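-d "$VENV_DIR" ]]; then echo "Error: venv not found: $VENV_DIR" exit 1 fi # shellcheck disable=SC1090 source "$VENV_DIR/bin/activate" python -m pip install --upgrade pip setuptools wheel # ================================================== # UNINSTALL MODE (-u) # ================================================== if [[ "$MODE" == "uninstall" ]]; then echo "==> Uninstalling incompatible packages ONLY (-u)" pip uninstall -y torch torchvision torchaudio || true pip uninstall -y numpy || true echo "" echo "Done." echo "Uninstall completed. Nothing else touched." exit 0 fi # ================================================== # INSTALL MODE (DEFAULT) # ================================================== echo "==> Installing compatible stack (no forced uninstall)" pip install \ numpy==1.26.4 \ torch==1.13.1+cu116 \ torchvision==0.14.1+cu116 \ torchaudio==0.13.1 \ --extra-index-url https://download.pytorch.org/whl/cu116 # --- Verify --- echo "==> Verifying environment" python - << 'EOF' import torch, numpy print("Torch:", torch.__version__) print("NumPy:", numpy.__version__) print("CUDA available:", torch.cuda.is_available()) if torch.cuda.is_available(): print("GPU:", torch.cuda.get_device_name(0)) print("Capability:", torch.cuda.get_device_capability(0)) EOF echo "" echo "Done." echo "Install completed without destructive actions." The script itself The script needs no switches, only the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language. #!/usr/bin/env bash set -euo pipefail # whisper-run.sh # Usage: # whisper <input.extension> [model] [language] # # Output: # <input-filename>.txt (same directory) # # Behaviour: # - Refuses to overwrite existing .txt # - Stops execution if output exists if [[ $# -lt 1 ]]; then echo "Usage: whisper <input.extension> [model] [language]" exit 1 fi INPUT="$1" MODEL="${2:-small}" LANGUAGE="${3:-}" if [[ ! -f "$INPUT" ]]; then echo "Error: Input file not found: $INPUT" exit 1 fi BASENAME="$(basename "$INPUT")" STEM="${BASENAME%.*}" OUTDIR="$(dirname "$INPUT")" OUTPUT="$OUTDIR/$STEM.txt" # --- Refuse overwrite --- if [[ -f "$OUTPUT" ]]; then echo "Error: Output file already exists:" echo " $OUTPUT" echo "Aborting to avoid overwrite." exit 1 fi # Prefer venv whisper if installed via install script WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}" WHISPER_BIN="whisper" if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then WHISPER_BIN="$WHISPER_VENV/bin/whisper" fi if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then echo "Error: whisper not found in PATH or venv." exit 1 fi TMPDIR="$(mktemp -d)" cleanup() { rm -rf "$TMPDIR"; } trap cleanup EXIT echo "==> Transcribing:" echo " input: $INPUT" echo " output: $OUTPUT" echo " model: $MODEL" echo " lang: ${LANGUAGE:-auto}" ARGS=( "$INPUT" --model "$MODEL" --output_dir "$TMPDIR" --output_format txt --task transcribe --verbose False --fp16 False ) if [[ -n "$LANGUAGE" ]]; then ARGS+=( --language "$LANGUAGE" ) fi "$WHISPER_BIN" "${ARGS[@]}" GENERATED_TXT="$TMPDIR/$STEM.txt" if [[ ! -f "$GENERATED_TXT" ]]; then FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)" if [[ -z "${FOUND_TXT:-}" ]]; then echo "Error: No .txt output produced." exit 1 fi GENERATED_TXT="$FOUND_TXT" fi # --- Final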
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
installer for WSL/Linux (with 1060-compatibility and pre-uninstaller)
The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
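
For anyone wondering what the wslpath step in whisper.bat does: for an ordinary drive-letter path it boils down to lower-casing the drive letter, turning backslashes into forward slashes and prefixing /mnt/. The function below is only a rough sketch of that idea. win_to_wsl_path is a made-up name, and it assumes the default /mnt mount root. The real wslpath also covers UNC paths and custom mount settings, so the batch file should keep calling wslpath.

```shell
# Hypothetical helper: approximates `wslpath` for plain drive-letter paths only.
# Assumes the default WSL mount root (/mnt); not a replacement for wslpath.
win_to_wsl_path() {
    wtw_path=$1
    case $wtw_path in
        [A-Za-z]:\\*|[A-Za-z]:/*) ;;   # looks like C:\... or C:/...
        *) printf 'unsupported path: %s\n' "$wtw_path" >&2; return 1 ;;
    esac
    # Drive letter, lower-cased (F -> f)
    wtw_drive=$(printf '%s' "$wtw_path" | cut -c1 | tr 'A-Z' 'a-z')
    # Everything after "F:", with backslashes turned into forward slashes
    wtw_rest=$(printf '%s' "$wtw_path" | cut -c3- | tr '\\' '/')
    printf '/mnt/%s%s\n' "$wtw_drive" "$wtw_rest"
}
```

Spaces and characters such as å ä ö pass through untouched, because only the drive letter and the backslashes are rewritten.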

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
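
The path in whisper.reg points at one specific location (F:\viktigt\...), so it needs to be adjusted on another machine. In a .reg file every backslash inside a string value is doubled and inner quotes are written as \". A small sketch that prints the command line for any batch file location, using a hypothetical helper called reg_command_value:

```shell
# Hypothetical helper: prints the default-value line for the "command" key
# in a .reg file, given the Windows path of the batch file.
reg_command_value() {
    # Double every backslash, as .reg string values require
    rcv_escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g')
    # Wrap the script path and the %1 placeholder in escaped quotes
    printf '@="\\"%s\\" \\"%%1\\""\n' "$rcv_escaped"
}

# Example (the path is only an illustration):
#   reg_command_value 'C:\Tools\whisper.bat'
```

The printed line can be pasted under the [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command] key in place of the original one.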

installer for WSL/Linux (with 1060-compatibility and pre-uninstaller)

The script has a -u switch that uninstalls the pinned packages (torch, torchvision, torchaudio and numpy) so they can be reinstalled cleanly. If the first attempt goes wrong, run the script with -u and then run it again without the switch.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
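
The verification step prints the versions but does not compare them with the pins. One possible follow-up check, as a sketch: feed pip freeze output to a small function that compares each package against the pinned version. check_pins is a hypothetical name, and the function ignores any +cu116 style suffix, on the assumption that pip may report the CUDA build for a package even when the pin was written without it.

```shell
# Hypothetical helper: reads `pip freeze` output on stdin and compares it
# with the versions pinned by the installer (local "+cu116" suffix ignored).
check_pins() {
    cp_freeze=$(cat)
    cp_status=0
    for cp_pin in numpy==1.26.4 torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1; do
        cp_name=${cp_pin%%==*}
        cp_want=${cp_pin##*==}
        # Installed version of this package, cut off before any "+" suffix
        cp_have=$(printf '%s\n' "$cp_freeze" | sed -n "s/^$cp_name==\([^+]*\).*/\1/p" | head -n 1)
        if [ "$cp_have" = "$cp_want" ]; then
            printf 'ok       %s==%s\n' "$cp_name" "$cp_want"
        else
            printf 'MISMATCH %s: want %s, have %s\n' "$cp_name" "$cp_want" "${cp_have:-none}"
            cp_status=1
        fi
    done
    return "$cp_status"
}

# Example, using the venv path from the installer:
#   "$HOME/.venvs/whisper/bin/pip" freeze | check_pins
```

A non-zero exit status means at least one package differs from its pin, which is the situation the -u switch is meant for.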

The script itself

The script needs no switches, only the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
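
Since the runner refuses to overwrite an existing .txt, transcribing a whole folder is easier if files that already have a transcript are skipped up front. A minimal sketch, assuming a hypothetical helper pending_transcripts and an arbitrary list of extensions (m4a, mp3, wav) that can be changed freely:

```shell
# Hypothetical helper: prints audio files in a directory that have no
# matching .txt next to them yet. The extension list is an assumption.
pending_transcripts() {
    pt_dir=${1:-.}
    for pt_file in "$pt_dir"/*.m4a "$pt_dir"/*.mp3 "$pt_dir"/*.wav; do
        [ -f "$pt_file" ] || continue              # skip unmatched glob patterns
        [ -f "${pt_file%.*}.txt" ] && continue     # transcript already exists
        printf '%s\n' "$pt_file"
    done
}

# Example loop (runner path as used in whisper.bat; folder is illustrative):
#   pending_transcripts /mnt/f/music | while IFS= read -r f; do
#       /usr/local/tornevall/whisper "$f" </dev/null
#   done
```

The "${pt_file%.*}.txt" expansion mirrors how the runner derives its output name, so the two agree on which files count as done.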
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
installer for WSL/Linux (with 1060-compatibility and pre-uninstaller)
The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer for WSL/Linux (with 1060-compatibility and pre-uninstaller)

The script has a -u switch that uninstalls the pinned packages (torch, torchvision, torchaudio and numpy) so they can be reinstalled cleanly. If the first attempt goes wrong, run the script with -u and then run it again without the switch.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
From 2025-12-23 11:26:07 (discovered: 2026-02-05 14:24:03) hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
To 2025-12-23 11:26:07 (discovered: 2026-03-19 13:50:20) hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
Title
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
Description
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
Content
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app. Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized. At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going. I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself. The end result was the following (thanks to ChatGPT): A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well. A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file. A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click. 
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names. The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in. WSL uses python and pip… Table of Contents: whisper.bat; whisper.reg (explorer right clicks); installer for WSL/Linux (with 1060-compatibility and pre-uninstaller); The script itself. whisper.bat @echo off setlocal EnableExtensions REM Force UTF-8 codepage (fixes å ä ö) chcp 65001 >nul REM File passed from Explorer set "WIN_FILE=%~1" REM Convert Windows path to WSL path (UTF-8 safe now) for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i" REM Run whisper on that file wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\"" endlocal whisper.reg (explorer right clicks) Windows Registry Editor Version 5.00 [HKEY_CLASSES_ROOT\*\shell\WhisperWSL] @="Transkribera med Whisper (WSL)" "Icon"="wsl.exe" [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command] @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\"" installer for WSL/Linux (with 1060-compatibility and pre-uninstaller) The script has a -u switch that uninstalls the pinned packages (torch, torchvision, torchaudio and numpy) so they can be reinstalled cleanly. If the first attempt goes wrong, run the script with -u and then run it again without the switch. #!/usr/bin/env bash set -euo pipefail VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}" MODE="install" # --- Parse args --- while getopts ":u" opt; do case "$opt" in u) MODE="uninstall" ;; *) echo "Usage: $0 [-u]" exit 1 ;; esac done echo "==> Whisper installer (GTX 1060 compatible)" echo "==> Mode: $MODE" # --- Sanity --- if [[ ! 
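-d "$VENV_DIR" ]]; then echo "Error: venv not found: $VENV_DIR" exit 1 fi # shellcheck disable=SC1090 source "$VENV_DIR/bin/activate" python -m pip install --upgrade pip setuptools wheel # ================================================== # UNINSTALL MODE (-u) # ================================================== if [[ "$MODE" == "uninstall" ]]; then echo "==> Uninstalling incompatible packages ONLY (-u)" pip uninstall -y torch torchvision torchaudio || true pip uninstall -y numpy || true echo "" echo "Done." echo "Uninstall completed. Nothing else touched." exit 0 fi # ================================================== # INSTALL MODE (DEFAULT) # ================================================== echo "==> Installing compatible stack (no forced uninstall)" pip install \ numpy==1.26.4 \ torch==1.13.1+cu116 \ torchvision==0.14.1+cu116 \ torchaudio==0.13.1 \ --extra-index-url https://download.pytorch.org/whl/cu116 # --- Verify --- echo "==> Verifying environment" python - << 'EOF' import torch, numpy print("Torch:", torch.__version__) print("NumPy:", numpy.__version__) print("CUDA available:", torch.cuda.is_available()) if torch.cuda.is_available(): print("GPU:", torch.cuda.get_device_name(0)) print("Capability:", torch.cuda.get_device_capability(0)) EOF echo "" echo "Done." echo "Install completed without destructive actions." The script itself The script needs no switches, only the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language. #!/usr/bin/env bash set -euo pipefail # whisper-run.sh # Usage: # whisper <input.extension> [model] [language] # # Output: # <input-filename>.txt (same directory) # # Behaviour: # - Refuses to overwrite existing .txt # - Stops execution if output exists if [[ $# -lt 1 ]]; then echo "Usage: whisper <input.extension> [model] [language]" exit 1 fi INPUT="$1" MODEL="${2:-small}" LANGUAGE="${3:-}" if [[ ! -f "$INPUT" ]]; then echo "Error: Input file not found: $INPUT" exit 1 fi BASENAME="$(basename "$INPUT")" STEM="${BASENAME%.*}" OUTDIR="$(dirname "$INPUT")" OUTPUT="$OUTDIR/$STEM.txt" # --- Refuse overwrite --- if [[ -f "$OUTPUT" ]]; then echo "Error: Output file already exists:" echo " $OUTPUT" echo "Aborting to avoid overwrite." exit 1 fi # Prefer venv whisper if installed via install script WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}" WHISPER_BIN="whisper" if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then WHISPER_BIN="$WHISPER_VENV/bin/whisper" fi if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then echo "Error: whisper not found in PATH or venv." exit 1 fi TMPDIR="$(mktemp -d)" cleanup() { rm -rf "$TMPDIR"; } trap cleanup EXIT echo "==> Transcribing:" echo " input: $INPUT" echo " output: $OUTPUT" echo " model: $MODEL" echo " lang: ${LANGUAGE:-auto}" ARGS=( "$INPUT" --model "$MODEL" --output_dir "$TMPDIR" --output_format txt --task transcribe --verbose False --fp16 False ) if [[ -n "$LANGUAGE" ]]; then ARGS+=( --language "$LANGUAGE" ) fi "$WHISPER_BIN" "${ARGS[@]}" GENERATED_TXT="$TMPDIR/$STEM.txt" if [[ ! -f "$GENERATED_TXT" ]]; then FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)" if [[ -z "${FOUND_TXT:-}" ]]; then echo "Error: No .txt output produced." exit 1 fi GENERATED_TXT="$FOUND_TXT" fi # --- Final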
Old vs new
From
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
installer for WSL/Linux (with 1060-compatibility and pre-uninstaller)
The script itself
whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal

whisper.reg (explorer right clicks)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

installer for WSL/Linux (with 1060-compatibility and pre-uninstaller)

The script has a -u switch that uninstalls the pinned packages (torch, torchvision, torchaudio and numpy) so they can be reinstalled cleanly. If the first attempt goes wrong, run the script with -u and then run it again without the switch.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."

The script itself

The script needs no switches, only the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
To
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060

DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...

CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example among several: I need a text transcribed so that it can be pasted into Suno, but it only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

I already knew that there had to be software capable of doing this completely locally. I had previously discovered that Whisper also exists in open-source form (I’m not even sure whether the Windows application actually builds on it). So today I decided to finally figure out how to do it properly myself.

The end result was the following (thanks to ChatGPT):

A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

WSL uses python and pip…

Table of Contents
whisper.bat
whisper.reg (Explorer right-click menu)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself

whisper.bat

@echo off
setlocal EnableExtensions

REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul

REM File passed from Explorer
set "WIN_FILE=%~1"

REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

endlocal
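
To make the wslpath step in the batch file less opaque, here is a minimal bash sketch of the same conversion for plain drive-letter paths. It assumes the default WSL automount root /mnt/<drive>. The function name win_to_wsl is made up for this example, and the real wslpath covers more cases (UNC paths, custom mount roots), so treat this as an illustration rather than a replacement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert C:\dir\file style paths to /mnt/c/dir/file.
# Assumes the default WSL automount root (/mnt/<drive letter>).
win_to_wsl() {
  local win="$1"
  # Regex kept in a variable so bash passes it to the regex engine unchanged.
  local re='^([A-Za-z]):[\\/](.*)$'
  if [[ "$win" =~ $re ]]; then
    local drive="${BASH_REMATCH[1],,}"        # lower-case drive letter
    local rest="${BASH_REMATCH[2]//\\//}"     # backslashes -> forward slashes
    printf '/mnt/%s/%s\n' "$drive" "$rest"
  else
    # Not a drive-letter path: return it unchanged.
    printf '%s\n' "$win"
  fi
}
```

Called as win_to_wsl 'C:\Music\My Song.m4a', it keeps spaces in the file name intact because the path is handled as a single quoted argument throughout.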

whisper.reg (Explorer right-click menu)

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"

[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
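
The path in the command value above is specific to one machine. A small helper like the following, run inside WSL, could print the same .reg content for another location of whisper.bat. The function name make_whisper_reg and the English default label are assumptions for this sketch. Only the key names and the value layout are taken from the file above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a whisper.reg for a given Windows path to whisper.bat.
make_whisper_reg() {
  local bat_path="$1"
  local label="${2:-Transcribe with Whisper (WSL)}"   # assumed default menu label
  # String values in .reg files need every backslash doubled.
  local escaped="${bat_path//\\/\\\\}"

  printf '%s\n\n' 'Windows Registry Editor Version 5.00'
  printf '%s\n' '[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]'
  printf '@="%s"\n' "$label"
  printf '%s\n\n' '"Icon"="wsl.exe"'
  printf '%s\n' '[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]'
  # Produces: @="\"<escaped path>\" \"%1\""
  printf '@="\\"%s\\" \\"%%1\\""\n' "$escaped"
}
```

For example, make_whisper_reg 'C:\Tools\whisper.bat' > whisper.reg writes a file that points the context-menu entry at that (hypothetical) location.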

Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)

To make sure things are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, this switch lets you reinstall a second time without conflicts.

#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
 case "$opt" in
 u) MODE="uninstall" ;;
 *)
 echo "Usage: $0 [-u]"
 exit 1
 ;;
 esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
 echo "Error: venv not found: $VENV_DIR"
 exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"

python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
 echo "==> Uninstalling incompatible packages ONLY (-u)"

 pip uninstall -y torch torchvision torchaudio || true
 pip uninstall -y numpy || true

 echo ""
 echo "Done."
 echo "Uninstall completed. Nothing else touched."
 exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================

echo "==> Installing compatible stack (no forced uninstall)"

pip install \
 numpy==1.26.4 \
 torch==1.13.1+cu116 \
 torchvision==0.14.1+cu116 \
 torchaudio==0.13.1 \
 --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
 print("GPU:", torch.cuda.get_device_name(0))
 print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
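
Note that the installer exits with "venv not found" unless the virtual environment already exists. A minimal sketch of preparing it could look like this. The function name ensure_venv is hypothetical, python3 with the venv module is assumed to be installed in WSL, and the package name openai-whisper in the comment is an assumption about which Whisper distribution is meant.

```shell
#!/usr/bin/env bash
# Same default location the installer and the runner use.
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"

# Hypothetical helper: create the venv only if the directory is missing.
ensure_venv() {
  local dir="$1"
  if [[ -d "$dir" ]]; then
    echo "exists: $dir"
    return 0
  fi
  mkdir -p "$(dirname "$dir")"
  python3 -m venv "$dir" && echo "created: $dir"
}

# Typical use (left commented out so that sourcing this sketch has no side effects):
#   ensure_venv "$VENV_DIR"
#   "$VENV_DIR/bin/pip" install -U openai-whisper   # assumed package name
```

Like the installer's own sanity check, the helper only tests that the directory exists. It does not verify that the directory is a working venv.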

The script itself

The script needs no switches. The only required argument is the audio file to transcribe, although, as the usage comment shows, it also accepts an optional model and language.

#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists

if [[ $# -lt 1 ]]; then
 echo "Usage: whisper <input.extension> [model] [language]"
 exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
 echo "Error: Input file not found: $INPUT"
 exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
 echo "Error: Output file already exists:"
 echo " $OUTPUT"
 echo "Aborting to avoid overwrite."
 exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
 WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
 echo "Error: whisper not found in PATH or venv."
 exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
 "$INPUT"
 --model "$MODEL"
 --output_dir "$TMPDIR"
 --output_format txt
 --task transcribe
 --verbose False
 --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
 ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
 FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
 if [[ -z "${FOUND_TXT:-}" ]]; then
 echo "Error: No .txt output produced."
 exit 1
 fi
 GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
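
Because the runner handles one file at a time and refuses to overwrite an existing transcript, a whole folder can be processed with a thin wrapper. The following is only a sketch: the RUNNER default reuses the path that whisper.bat calls, while the function names and the list of extensions (m4a, mp3, wav) are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical batch wrapper around the runner script.
RUNNER="${RUNNER:-/usr/local/tornevall/whisper}"

# Derive the transcript path the same way the runner does:
# same directory, last extension replaced by .txt.
transcript_path() {
  local input="$1"
  local base stem
  base="$(basename "$input")"
  stem="${base%.*}"
  printf '%s/%s.txt\n' "$(dirname "$input")" "$stem"
}

# Run the runner on every audio file in a directory, skipping files
# that already have a transcript next to them.
transcribe_dir() {
  local dir="$1" f
  for f in "$dir"/*.m4a "$dir"/*.mp3 "$dir"/*.wav; do
    [[ -f "$f" ]] || continue            # unmatched glob stays literal; skip it
    if [[ -f "$(transcript_path "$f")" ]]; then
      echo "skip: $f"
      continue
    fi
    "$RUNNER" "$f" || echo "failed: $f" >&2
  done
}
```

Skipping in the wrapper avoids the runner's "Output file already exists" error for files that were already transcribed, so a second run only processes new recordings.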

Versions

  1. 2025-12-23 11:26:07
    Discovered: 2026-04-24 08:16:26 Hash: 7048054bb2d73799a6f2563ca0267e8a302b4ff0
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc. I first found a Samsung app that could handle […]
  2. 2025-12-23 11:26:07
    Discovered: 2026-04-24 08:14:23 Hash: 16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]
  3. 2025-12-23 11:26:07
    Discovered: 2026-03-19 13:50:20 Hash: b0fbb9c4287dd26aa452f1adc93e224e681051e1
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
  4. 2025-12-23 11:26:07
    Discovered: 2026-02-05 14:24:03 Hash: 30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
    Title:
    The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
    Description:
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text to paste into Suno that only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that...
    Content
    I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text to paste into Suno that only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).

    I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.

    Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.

    At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.

    I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.

    The end result was the following (thanks to ChatGPT):

    A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.

    A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.

    A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.

    A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.

    The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.

    WSL uses python and pip…

    whisper.bat

    @echo off
    setlocal EnableExtensions

    REM Force UTF-8 codepage (fixes å ä ö)
    chcp 65001 >nul

    REM File passed from Explorer
    set "WIN_FILE=%~1"

    REM Convert Windows path to WSL path (UTF-8 safe now)
    for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"

    REM Run whisper on that file
    wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""

    endlocal

    whisper.reg (Explorer right-click menu)

    Windows Registry Editor Version 5.00

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
    @="Transcribe with Whisper (WSL)"
    "Icon"="wsl.exe"

    [HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
    @="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""

    Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)

    To make sure the conflicting packages are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, run it with -u and then install again without conflicts.

    #!/usr/bin/env bash
    set -euo pipefail

    VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
    MODE="install"

    # --- Parse args ---
    while getopts ":u" opt; do
        case "$opt" in
            u) MODE="uninstall" ;;
            *)
                echo "Usage: $0 [-u]"
                exit 1
                ;;
        esac
    done

    echo "==> Whisper installer (GTX 1060 compatible)"
    echo "==> Mode: $MODE"

    # --- Sanity ---
    if [[ ! -d "$VENV_DIR" ]]; then
        echo "Error: venv not found: $VENV_DIR"
        exit 1
    fi

    # shellcheck disable=SC1090
    source "$VENV_DIR/bin/activate"

    python -m pip install --upgrade pip setuptools wheel

    # ==================================================
    # UNINSTALL MODE (-u)
    # ==================================================
    if [[ "$MODE" == "uninstall" ]]; then
        echo "==> Uninstalling incompatible packages ONLY (-u)"

        pip uninstall -y torch torchvision torchaudio || true
        pip uninstall -y numpy || true

        echo ""
        echo "Done."
        echo "Uninstall completed. Nothing else touched."
        exit 0
    fi

    # ==================================================
    # INSTALL MODE (DEFAULT)
    # ==================================================

    echo "==> Installing compatible stack (no forced uninstall)"

    pip install \
        numpy==1.26.4 \
        torch==1.13.1+cu116 \
        torchvision==0.14.1+cu116 \
        torchaudio==0.13.1 \
        --extra-index-url https://download.pytorch.org/whl/cu116

    # --- Verify ---
    echo "==> Verifying environment"
    python - << 'EOF'
    import torch, numpy
    print("Torch:", torch.__version__)
    print("NumPy:", numpy.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        print("Capability:", torch.cuda.get_device_capability(0))
    EOF

    echo ""
    echo "Done."
    echo "Install completed without destructive actions."
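The installer above stops with "venv not found" unless the virtual environment already exists, and it only pins NumPy and PyTorch. A minimal sketch of creating the environment first, assuming python3 with the venv module is available; ensure_venv is a hypothetical helper name, not part of the original scripts, and the default path simply mirrors the installer's VENV_DIR default.

```shell
#!/usr/bin/env bash
# Sketch only: ensure_venv is a hypothetical helper, not part of the post's scripts.
# It creates the virtual environment the installer expects, if it is missing,
# and prints the path so it can be reused by the caller.

ensure_venv() {
    local venv_dir
    venv_dir="${1:-$HOME/.venvs/whisper}"
    if [ ! -d "$venv_dir" ]; then
        # Assumes python3 and its venv module are installed.
        python3 -m venv "$venv_dir" >&2
    fi
    printf '%s\n' "$venv_dir"
}
```

Calling ensure_venv with no argument before running the installer would create $HOME/.venvs/whisper on first use and leave an existing environment untouched. The whisper command itself is presumably installed into that environment separately, since the installer shown does not do so.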

    The script itself

    The script needs only one argument, the audio file to transcribe; the model and language are optional second and third arguments.

    #!/usr/bin/env bash
    set -euo pipefail

    # whisper-run.sh
    # Usage:
    #   whisper <input.extension> [model] [language]
    #
    # Output:
    #   <input-filename>.txt (same directory)
    #
    # Behaviour:
    #   - Refuses to overwrite existing .txt
    #   - Stops execution if output exists

    if [[ $# -lt 1 ]]; then
        echo "Usage: whisper <input.extension> [model] [language]"
        exit 1
    fi

    INPUT="$1"
    MODEL="${2:-small}"
    LANGUAGE="${3:-}"

    if [[ ! -f "$INPUT" ]]; then
        echo "Error: Input file not found: $INPUT"
        exit 1
    fi

    BASENAME="$(basename "$INPUT")"
    STEM="${BASENAME%.*}"
    OUTDIR="$(dirname "$INPUT")"
    OUTPUT="$OUTDIR/$STEM.txt"

    # --- Refuse overwrite ---
    if [[ -f "$OUTPUT" ]]; then
        echo "Error: Output file already exists:"
        echo " $OUTPUT"
        echo "Aborting to avoid overwrite."
        exit 1
    fi

    # Prefer venv whisper if installed via install script
    WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
    WHISPER_BIN="whisper"
    if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
        WHISPER_BIN="$WHISPER_VENV/bin/whisper"
    fi

    if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
        echo "Error: whisper not found in PATH or venv."
        exit 1
    fi

    TMPDIR="$(mktemp -d)"
    cleanup() { rm -rf "$TMPDIR"; }
    trap cleanup EXIT

    echo "==> Transcribing:"
    echo " input: $INPUT"
    echo " output: $OUTPUT"
    echo " model: $MODEL"
    echo " lang: ${LANGUAGE:-auto}"

    ARGS=(
        "$INPUT"
        --model "$MODEL"
        --output_dir "$TMPDIR"
        --output_format txt
        --task transcribe
        --verbose False
        --fp16 False
    )

    if [[ -n "$LANGUAGE" ]]; then
        ARGS+=( --language "$LANGUAGE" )
    fi

    "$WHISPER_BIN" "${ARGS[@]}"

    GENERATED_TXT="$TMPDIR/$STEM.txt"
    if [[ ! -f "$GENERATED_TXT" ]]; then
        FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
        if [[ -z "${FOUND_TXT:-}" ]]; then
            echo "Error: No .txt output produced."
            exit 1
        fi
        GENERATED_TXT="$FOUND_TXT"
    fi

    # --- Final move (no overwrite possible due to earlier check) ---
    mv "$GENERATED_TXT" "$OUTPUT"

    echo "==> Done:"
    echo " $OUTPUT"
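
The runner's output path and overwrite guard can be tried in isolation. The sketch below repeats the same basename, extension-stripping, and dirname steps as the script; derive_output and safe_to_write are hypothetical helper names introduced here for illustration, not functions from the original.

```shell
#!/usr/bin/env bash
# Sketch only: derive_output and safe_to_write are hypothetical helpers that
# mirror the STEM/OUTDIR/OUTPUT logic and the overwrite check in the runner.

derive_output() {
    local input base stem dir
    input="$1"
    base="$(basename "$input")"   # file name without the directory part
    stem="${base%.*}"             # strip only the last extension
    dir="$(dirname "$input")"     # "." when no directory was given
    printf '%s/%s.txt\n' "$dir" "$stem"
}

# Succeeds (exit status 0) only when no regular file exists at the target path.
safe_to_write() {
    [ ! -f "$1" ]
}
```

Only the final extension is removed, so an input named take.final.wav maps to take.final.txt in the same directory, and spaces in the file name are preserved as long as the path stays quoted.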

The main purpose of the tech house track “Magdalena”

Permalink
Published: 2025-12-13 14:49:36
Discovered: 2026-03-19 13:50:20
Author: 1
Hash: eac2578c9dbd2bfb4ea9a741c22c44621e74487d
https://www.tornevalls.se/the-main-purpose-of-the-tech-house-track-magdalena/
Description

The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

Content

This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.


History — 2 versions shown

Changes

From 2025-12-13 14:49:36 (discovered: 2026-02-05 14:24:03) hash: 1985fb7d9c92a64980bccca214eb4b47382281d4
To 2025-12-13 14:49:36 (discovered: 2026-03-19 13:50:20) hash: eac2578c9dbd2bfb4ea9a741c22c44621e74487d
Title
The main purpose of the tech house track “Magdalena”
Description
The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.
Content
This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion. Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure. The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition. The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging. Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions. Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.
Old vs new
From
TITLE:
The main purpose of the tech house track “Magdalena”

DESCRIPTION:
The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

CONTENT:
This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.
To
TITLE:
The main purpose of the tech house track “Magdalena”

DESCRIPTION:
The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

CONTENT:
This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.

Versions

  1. 2025-12-13 14:49:36
    Discovered: 2026-03-19 13:50:20 Hash: eac2578c9dbd2bfb4ea9a741c22c44621e74487d
    Title:
    The main purpose of the tech house track “Magdalena”
    Description:
    The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.
    Content
    This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

    Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

    The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

    The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

    Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

    Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.
  2. 2025-12-13 14:49:36
    Discovered: 2026-02-05 14:24:03 Hash: 1985fb7d9c92a64980bccca214eb4b47382281d4
    Title:
    The main purpose of the tech house track “Magdalena”
    Description:
    The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.
    Content
    This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

    Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

    The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

    The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

    Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

    Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.

The main purpose of the tech house track “Magdalena”

Permalink
Published: 2025-12-13 14:49:36
Discovered: 2026-02-05 14:24:03
Author: 1
Hash: 1985fb7d9c92a64980bccca214eb4b47382281d4
https://www.tornevalls.se/the-main-purpose-of-the-tech-house-track-magdalena/
Description

The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

Content

This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.


History — 2 versions shown

Changes

From 2025-12-13 14:49:36 (discovered: 2026-02-05 14:24:03) hash: 1985fb7d9c92a64980bccca214eb4b47382281d4
To 2025-12-13 14:49:36 (discovered: 2026-03-19 13:50:20) hash: eac2578c9dbd2bfb4ea9a741c22c44621e74487d
Title
The main purpose of the tech house track “Magdalena”
Description
The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.
Content
This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion. Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure. The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition. The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging. Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions. Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.
Old vs new
From
TITLE:
The main purpose of the tech house track “Magdalena”

DESCRIPTION:
The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

CONTENT:
This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.
To
TITLE:
The main purpose of the tech house track “Magdalena”

DESCRIPTION:
The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

CONTENT:
This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.

Versions

  1. 2025-12-13 14:49:36
    Discovered: 2026-03-19 13:50:20 Hash: eac2578c9dbd2bfb4ea9a741c22c44621e74487d
    Title:
    The main purpose of the tech house track “Magdalena”
    Description:
    The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.
    Content
    This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

    Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

    The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

    The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

    Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

    Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.
  2. 2025-12-13 14:49:36
    Discovered: 2026-02-05 14:24:03 Hash: 1985fb7d9c92a64980bccca214eb4b47382281d4
    Title:
    The main purpose of the tech house track “Magdalena”
    Description:
    The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.
    Content
    This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.

    Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.

    The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.

    The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.

    Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.

    Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.

Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About

Permalink
Published: 2025-12-05 15:28:43
Discovered: 2026-04-24 08:16:26
Author: 1
Hash: bdde7b7698eb2daba3719ec4a81f56d716d5674a
https://www.tornevalls.se/things-prompt-pushers-thought-they-understood-about-daws-but-are-completely-utterly-wrong-about/
Description

Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example – my personal favorite – came from an […]

Content

Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.

Spoiler: they don’t. Not even close.

The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.

This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs work.

A DAW IS NOT AI – IT WILL NEVER BE AI!

A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.

It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.

AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine-based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer, a tool that has existed since the 1980s.
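
To make the contrast concrete, here is a minimal sketch of what a step sequencer boils down to. The function name `step_times` and the text pattern format are hypothetical, made up for illustration and not taken from any real DAW. The point is only that the output is fixed arithmetic on the input: nothing is trained, inferred or predicted.

```python
def step_times(pattern, bpm=120.0, steps_per_beat=4):
    """Return the start time (in seconds) of every active step.

    `pattern` is a hypothetical text grid: "x" means "trigger", anything else is silence.
    """
    step_len = 60.0 / bpm / steps_per_beat  # length of one step in seconds
    return [i * step_len for i, ch in enumerate(pattern) if ch == "x"]


# A four-on-the-floor kick at 120 BPM, sixteen steps per bar.
kick = step_times("x...x...x...x...")
print(kick)
```

Run it a thousand times with the same pattern and tempo and you get the same list a thousand times. That is a timeline doing what it was told, not a model inferring anything.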

But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”.

The text used is not a technical definition. It is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”. That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode.

It talks about using AI-powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI-generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer-reviewed research.

“But VSTs Use AI!”

Yes, some do – and that still doesn’t make your DAW AI!

This was the next brilliant argument thrown at me:

“Most of the components in a DAW use AI. Are you slow?”

First: No, they don’t.

Second: Calling people slow, or throwing other random insults at them, doesn’t magically make your argument correct.

Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.
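
For anyone who has never looked inside such a plugin, the sketch below shows roughly what “traditional DSP” means in practice. It is a simplified illustration, not the code of any real plugin: the function names, the default threshold and ratio, and the hard-knee behaviour without attack or release are all assumptions chosen to keep the example short.

```python
import math


def compress(samples, threshold_db=-12.0, ratio=4.0):
    """Static hard-knee compressor: fixed arithmetic per sample, no model, no training."""
    out = []
    for x in samples:
        level = abs(x)
        if level == 0.0:
            out.append(0.0)  # silence stays silence (and avoids log10(0))
            continue
        level_db = 20.0 * math.log10(level)
        if level_db > threshold_db:
            # Above the threshold: keep only 1/ratio of the overshoot.
            target_db = threshold_db + (level_db - threshold_db) / ratio
            gain = 10.0 ** ((target_db - level_db) / 20.0)
        else:
            gain = 1.0  # below the threshold: leave the sample untouched
        out.append(x * gain)
    return out


def one_pole_lowpass(samples, alpha=0.2):
    """One-pole low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    y = 0.0
    out = []
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


print(compress([0.1, 1.0, -1.0]))
print(one_pole_lowpass([1.0, 1.0], alpha=0.5))
```

Real compressors add envelope followers, soft knees and make-up gain, and real filters are usually more elaborate. The principle is the same, though: a formula applied to samples, with nothing learned from data.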

Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real-time denoisers trained on speech like VoiceGate, or deep-learning-based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use machine learning. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

The Real Issue: AI Musicians Who Don’t Understand Music Tools

This whole conversation actually exposed something deeper, which is what makes this topic interesting and why I chose to highlight the idiocy: many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:

have never mixed a track manually,

never aligned vocals without an AI tool,

never programmed automation by hand,

never learned gain staging,

never rendered or layered anything intentionally,

and absolutely never used a DAW beyond dragging stems into the timeline.

Yet they lecture others on “how audio production really works”.

And when someone challenges their nonsense, they fire off buzzwords like:

“you refuse to be educated”

“you’re a luddite”

“DAWs used AI for decades!”

“it’s the same thing!”

No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

Why This Matters

The problem isn’t people using AI.

The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.

AI is powerful. It’s useful – but it doesn’t replace understanding.

If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”. You’re not misunderstood.

You’re just wrong. And loudly so.

If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:

Learn what your tools are.

Learn what your tools are not.

Stop claiming everything with buttons and soundwaves is AI.

Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.


History — 4 versions shown

Changes

From 2025-12-05 15:28:43 (discovered: 2026-04-24 08:14:23) hash: d376ec321e6bb84aed112ef4528c71b7545d616b
To 2025-12-05 15:28:43 (discovered: 2026-04-24 08:16:26) hash: bdde7b7698eb2daba3719ec4a81f56d716d5674a
Title
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About
Description
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example my personal favorite came from an […]
Content
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now. This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs works. Table of Contents Toggle A DAW IS NOT AI – IT WILL NEVER BE AI!“But VSTs Use AI!”Yes, some do – and that still doesn’t make your DAW AI!The Real Issue: AI Musicians Who Don’t Understand Music ToolsWhy This Matters A DAW IS NOT AI – IT WILL NEVER BE AI! A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment. It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do. AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s. But apparently, for some people on the internet, everything becomes AI if you squint hard enough. That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”. 
The text used is not a technical definition, it is an AI generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is by the way extremely unreliable as it very much works as ChatGPT in “quick response mode” and guessing (since users tend to hate wait for a correct answer) what it cannot cover by itself (unless you use Thinking mode). It talks about using AI powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer reviewed research. “But VSTs Use AI!” Yes, some do – and that still doesn’t make your DAW AI! This was the next brilliant argument thrown at me: “Most of the components in a DAW use AI. Are you slow?” First: No, they don’t. Second: Calling people slow and other random words doesn’t magically make your argument correct. Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics. Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real time denoisers trained on speech like VoiceGate or deep learning based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either. 
The Real Issue: AI Musicians Who Don’t Understand Music Tools This whole conversation acutally exposed something deeper, which makes this topic interesting, and that is why I choose to highlight the idiocrazy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who: have never mixed a track manually, never aligned vocals without an AI tool, never programmed automation by hand, never learned gain staging, never rendered or layered anything intentionally, and absolutely never used a DAW beyond dragging stems into the timeline. Yet they lecture others on “how audio production really works”. And when someone challenges their nonsense, they fire off buzzwords like: “you refuse to be educated” “you’re a luddite” “DAWs used AI for decades!” “it’s the same thing!” No. It’s not. And
Old vs new
From
TITLE:
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About

DESCRIPTION:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared […]

CONTENT:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.

Spoiler: they don’t. Not even close.

The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.

This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs works.

Table of Contents
Toggle
A DAW IS NOT AI – IT WILL NEVER BE AI!“But VSTs Use AI!”Yes, some do – and that still doesn’t make your DAW AI!The Real Issue: AI Musicians Who Don’t Understand Music ToolsWhy This Matters
A DAW IS NOT AI – IT WILL NEVER BE AI!

A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.

It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.

AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s.

But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”.

The text used is not a technical definition, it is an AI generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is by the way extremely unreliable as it very much works as ChatGPT in “quick response mode” and guessing (since users tend to hate wait for a correct answer) what it cannot cover by itself (unless you use Thinking mode).

It talks about using AI powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer reviewed research.

“But VSTs Use AI!”

Yes, some do – and that still doesn’t make your DAW AI!

This was the next brilliant argument thrown at me:

“Most of the components in a DAW use AI. Are you slow?”

First: No, they don’t.

Second: Calling people slow and other random words doesn’t magically make your argument correct.

Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.

Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real time denoisers trained on speech like VoiceGate or deep learning based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

The Real Issue: AI Musicians Who Don’t Understand Music Tools

This whole conversation acutally exposed something deeper, which makes this topic interesting, and that is why I choose to highlight the idiocrazy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:

have never mixed a track manually,

never aligned vocals without an AI tool,

never programmed automation by hand,

never learned gain staging,

never rendered or layered anything intentionally,

and absolutely never used a DAW beyond dragging stems into the timeline.

Yet they lecture others on “how audio production really works”.

And when someone challenges their nonsense, they fire off buzzwords like:

“you refuse to be educated”

“you’re a luddite”

“DAWs used AI for decades!”

“it’s the same thing!”

No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

Why This Matters

The problem isn’t people using AI.

The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.

AI is powerful. It’s useful – but it doesn’t replace understanding.

If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”.You’re not misunderstood.

You’re just wrong. And loudly so.

If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:

Learn what your tools are.

Learn what your tools are not.

Stop claiming everything with buttons and soundwaves is AI.

Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.
To
TITLE:
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About

DESCRIPTION:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example – my personal favorite – came from an […]

CONTENT:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.

Spoiler: they don’t. Not even close.

The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.

This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs works.

Table of Contents
Toggle
A DAW IS NOT AI – IT WILL NEVER BE AI!“But VSTs Use AI!”Yes, some do – and that still doesn’t make your DAW AI!The Real Issue: AI Musicians Who Don’t Understand Music ToolsWhy This Matters
A DAW IS NOT AI – IT WILL NEVER BE AI!

A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.

It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.

AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s.

But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”.

The text used is not a technical definition, it is an AI generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is by the way extremely unreliable as it very much works as ChatGPT in “quick response mode” and guessing (since users tend to hate wait for a correct answer) what it cannot cover by itself (unless you use Thinking mode).

It talks about using AI powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer reviewed research.

“But VSTs Use AI!”

Yes, some do – and that still doesn’t make your DAW AI!

This was the next brilliant argument thrown at me:

“Most of the components in a DAW use AI. Are you slow?”

First: No, they don’t.

Second: Calling people slow and other random words doesn’t magically make your argument correct.

Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.

Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real time denoisers trained on speech like VoiceGate or deep learning based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

The Real Issue: AI Musicians Who Don’t Understand Music Tools

This whole conversation acutally exposed something deeper, which makes this topic interesting, and that is why I choose to highlight the idiocrazy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:

have never mixed a track manually,

never aligned vocals without an AI tool,

never programmed automation by hand,

never learned gain staging,

never rendered or layered anything intentionally,

and absolutely never used a DAW beyond dragging stems into the timeline.

Yet they lecture others on “how audio production really works”.

And when someone challenges their nonsense, they fire off buzzwords like:

“you refuse to be educated”

“you’re a luddite”

“DAWs used AI for decades!”

“it’s the same thing!”

No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

Why This Matters

The problem isn’t people using AI.

The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.

AI is powerful. It’s useful – but it doesn’t replace understanding.

If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”.You’re not misunderstood.

You’re just wrong. And loudly so.

If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:

Learn what your tools are.

Learn what your tools are not.

Stop claiming everything with buttons and soundwaves is AI.

Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.
From 2025-12-05 15:28:43 (discovered: 2026-03-19 13:50:20) hash: c700d9cdac72c1b650d96e4c5992db0e4d20c2a1
To 2025-12-05 15:28:43 (discovered: 2026-04-24 08:14:23) hash: d376ec321e6bb84aed112ef4528c71b7545d616b
Title
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About
Description
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they... […]
Content
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now. This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs works. Table of Contents Toggle A DAW IS NOT AI – IT WILL NEVER BE AI!“But VSTs Use AI!”Yes, some do – and that still doesn’t make your DAW AI!The Real Issue: AI Musicians Who Don’t Understand Music ToolsWhy This Matters A DAW IS NOT AI – IT WILL NEVER BE AI! A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment. It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do. AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s. But apparently, for some people on the internet, everything becomes AI if you squint hard enough. That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”. 
The text used is not a technical definition; it is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode. It talks about using AI-powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI-generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer-reviewed research. “But VSTs Use AI!” Yes, some do – and that still doesn’t make your DAW AI! This was the next brilliant argument thrown at me: “Most of the components in a DAW use AI. Are you slow?” First: No, they don’t. Second: Calling people slow and other random words doesn’t magically make your argument correct. Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics. Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real-time denoisers trained on speech like VoiceGate or deep-learning-based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either. 
The Real Issue: AI Musicians Who Don’t Understand Music Tools This whole conversation actually exposed something deeper, which makes this topic interesting, and that is why I chose to highlight the idiocy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who: have never mixed a track manually, never aligned vocals without an AI tool, never programmed automation by hand, never learned gain staging, never rendered or layered anything intentionally, and absolutely never used a DAW beyond dragging stems into the timeline. Yet they lecture others on “how audio production really works”. And when someone challenges their nonsense, they fire off buzzwords like: “you refuse to be educated” “you’re a luddite” “DAWs used AI for decades!” “it’s the same thing!” No. It’s not. And
Old vs new
From
TITLE:
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About

DESCRIPTION:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they...

CONTENT:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.

Spoiler: they don’t. Not even close.

The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.

This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs work.

Table of Contents
A DAW IS NOT AI – IT WILL NEVER BE AI!
“But VSTs Use AI!”
Yes, some do – and that still doesn’t make your DAW AI!
The Real Issue: AI Musicians Who Don’t Understand Music Tools
Why This Matters

A DAW IS NOT AI – IT WILL NEVER BE AI!

A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.

It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.

AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s.

But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”.

The text used is not a technical definition; it is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode.

It talks about using AI powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer reviewed research.

“But VSTs Use AI!”

Yes, some do – and that still doesn’t make your DAW AI!

This was the next brilliant argument thrown at me:

“Most of the components in a DAW use AI. Are you slow?”

First: No, they don’t.

Second: Calling people slow and other random words doesn’t magically make your argument correct.

Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.

Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real time denoisers trained on speech like VoiceGate or deep learning based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

The Real Issue: AI Musicians Who Don’t Understand Music Tools

This whole conversation actually exposed something deeper, which makes this topic interesting, and that is why I chose to highlight the idiocy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:

have never mixed a track manually,

never aligned vocals without an AI tool,

never programmed automation by hand,

never learned gain staging,

never rendered or layered anything intentionally,

and absolutely never used a DAW beyond dragging stems into the timeline.

Yet they lecture others on “how audio production really works”.

And when someone challenges their nonsense, they fire off buzzwords like:

“you refuse to be educated”

“you’re a luddite”

“DAWs used AI for decades!”

“it’s the same thing!”

No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

Why This Matters

The problem isn’t people using AI.

The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.

AI is powerful. It’s useful – but it doesn’t replace understanding.

If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”. You’re not misunderstood.

You’re just wrong. And loudly so.

If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:

Learn what your tools are.

Learn what your tools are not.

Stop claiming everything with buttons and soundwaves is AI.

Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.
To
TITLE:
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About

DESCRIPTION:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared […]

CONTENT:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.

Spoiler: they don’t. Not even close.

The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.

This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs work.

Table of Contents
A DAW IS NOT AI – IT WILL NEVER BE AI!
“But VSTs Use AI!”
Yes, some do – and that still doesn’t make your DAW AI!
The Real Issue: AI Musicians Who Don’t Understand Music Tools
Why This Matters

A DAW IS NOT AI – IT WILL NEVER BE AI!

A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.

It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.

AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s.

But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”.

The text used is not a technical definition; it is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode.

It talks about using AI powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer reviewed research.

“But VSTs Use AI!”

Yes, some do – and that still doesn’t make your DAW AI!

This was the next brilliant argument thrown at me:

“Most of the components in a DAW use AI. Are you slow?”

First: No, they don’t.

Second: Calling people slow and other random words doesn’t magically make your argument correct.

Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.

Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real time denoisers trained on speech like VoiceGate or deep learning based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

The Real Issue: AI Musicians Who Don’t Understand Music Tools

This whole conversation actually exposed something deeper, which makes this topic interesting, and that is why I chose to highlight the idiocy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:

have never mixed a track manually,

never aligned vocals without an AI tool,

never programmed automation by hand,

never learned gain staging,

never rendered or layered anything intentionally,

and absolutely never used a DAW beyond dragging stems into the timeline.

Yet they lecture others on “how audio production really works”.

And when someone challenges their nonsense, they fire off buzzwords like:

“you refuse to be educated”

“you’re a luddite”

“DAWs used AI for decades!”

“it’s the same thing!”

No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

Why This Matters

The problem isn’t people using AI.

The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.

AI is powerful. It’s useful – but it doesn’t replace understanding.

If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”. You’re not misunderstood.

You’re just wrong. And loudly so.

If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:

Learn what your tools are.

Learn what your tools are not.

Stop claiming everything with buttons and soundwaves is AI.

Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.
From 2025-12-05 15:28:43 (discovered: 2026-02-05 14:24:03) hash: 3e81c80ea7c802eb7b125427806c792985432bbb
To 2025-12-05 15:28:43 (discovered: 2026-03-19 13:50:20) hash: c700d9cdac72c1b650d96e4c5992db0e4d20c2a1
Title
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About
Description
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they...
Content
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now. This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs work. Table of Contents: A DAW IS NOT AI – IT WILL NEVER BE AI! | “But VSTs Use AI!” | Yes, some do – and that still doesn’t make your DAW AI! | The Real Issue: AI Musicians Who Don’t Understand Music Tools | Why This Matters A DAW IS NOT AI – IT WILL NEVER BE AI! A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment. It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do. AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine-based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s. But apparently, for some people on the internet, everything becomes AI if you squint hard enough. That ALSO includes the Google screenshot that was used to prove me wrong, saying “Yes, DAWs use AI in a variety of ways to enhance music production”. 
The text used is not a technical definition; it is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode. It talks about using AI-powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI-generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer-reviewed research. “But VSTs Use AI!” Yes, some do – and that still doesn’t make your DAW AI! This was the next brilliant argument thrown at me: “Most of the components in a DAW use AI. Are you slow?” First: No, they don’t. Second: Calling people slow and other random words doesn’t magically make your argument correct. Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics. Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real-time denoisers trained on speech like VoiceGate or deep-learning-based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. 
Your microwave doesn’t become AI if you heat a smart thermometer inside it either. The Real Issue: AI Musicians Who Don’t Understand Music Tools This whole conversation actually exposed something deeper, which makes this topic interesting, and that is why I chose to highlight the idiocy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who: have never mixed a track manually, never aligned vocals without an AI tool, never programmed automation by hand, never learned gain staging, never rendered or layered anything intentionally, and absolutely never used a DAW beyond dragging stems into the timeline. Yet they lecture others on “how audio production really works”. And when someone challenges their nonsense, they fire off buzzwords like: “you refuse to be educated” “you’re a luddite” “DAWs used AI for decades!” “it’s the same thing!” No. It’s not. And
Old vs new
From
TITLE:
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About

DESCRIPTION:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they...

CONTENT:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.

Spoiler: they don’t. Not even close.

The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.

This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs work.

Table of Contents
A DAW IS NOT AI – IT WILL NEVER BE AI!
“But VSTs Use AI!”
Yes, some do – and that still doesn’t make your DAW AI!
The Real Issue: AI Musicians Who Don’t Understand Music Tools
Why This Matters

A DAW IS NOT AI – IT WILL NEVER BE AI!

A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.

It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.

AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s.

But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”.

The text used is not a technical definition; it is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode.

It talks about using AI powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer reviewed research.

“But VSTs Use AI!”

Yes, some do – and that still doesn’t make your DAW AI!

This was the next brilliant argument thrown at me:

“Most of the components in a DAW use AI. Are you slow?”

First: No, they don’t.

Second: Calling people slow and other random words doesn’t magically make your argument correct.

Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.

Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real time denoisers trained on speech like VoiceGate or deep learning based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

The Real Issue: AI Musicians Who Don’t Understand Music Tools

This whole conversation actually exposed something deeper, which makes this topic interesting, and that is why I chose to highlight the idiocy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:

have never mixed a track manually,

never aligned vocals without an AI tool,

never programmed automation by hand,

never learned gain staging,

never rendered or layered anything intentionally,

and absolutely never used a DAW beyond dragging stems into the timeline.

Yet they lecture others on “how audio production really works”.

And when someone challenges their nonsense, they fire off buzzwords like:

“you refuse to be educated”

“you’re a luddite”

“DAWs used AI for decades!”

“it’s the same thing!”

No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

Why This Matters

The problem isn’t people using AI.

The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.

AI is powerful. It’s useful – but it doesn’t replace understanding.

If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”. You’re not misunderstood.

You’re just wrong. And loudly so.

If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:

Learn what your tools are.

Learn what your tools are not.

Stop claiming everything with buttons and soundwaves is AI.

Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.
To
TITLE:
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About

DESCRIPTION:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they...

CONTENT:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.

Spoiler: they don’t. Not even close.

The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.

This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs work.

Table of Contents
A DAW IS NOT AI – IT WILL NEVER BE AI!
“But VSTs Use AI!”
Yes, some do – and that still doesn’t make your DAW AI!
The Real Issue: AI Musicians Who Don’t Understand Music Tools
Why This Matters

A DAW IS NOT AI – IT WILL NEVER BE AI!

A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.

It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.

AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s.

But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

That ALSO includes the Google screenshot that was used to prove me wrong saying “Yes, DAWs use AI in a variety of ways to enhance music production”.

The text used is not a technical definition; it is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode.

It talks about using AI powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer reviewed research.

“But VSTs Use AI!”

Yes, some do – and that still doesn’t make your DAW AI!

This was the next brilliant argument thrown at me:

“Most of the components in a DAW use AI. Are you slow?”

First: No, they don’t.

Second: Calling people slow and other random words doesn’t magically make your argument correct.

Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.
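
To make that concrete, here is a minimal sketch of what a compressor boils down to, assuming a static hard-knee design with no attack or release smoothing. The function names, the -20 dB threshold and the 4:1 ratio are illustrative assumptions, not taken from any real plugin. There is no training data and no model anywhere in it, just arithmetic on samples.

```python
import math


def compress_sample(x: float, threshold_db: float = -20.0, ratio: float = 4.0) -> float:
    """Static hard-knee compression of one sample in the range [-1.0, 1.0].

    Hypothetical illustration only: real compressors add envelope
    followers (attack/release), soft knees and makeup gain.
    """
    if x == 0.0:
        return 0.0  # silence stays silence (and avoids log10(0))

    # Convert the sample's magnitude to decibels relative to full scale.
    level_db = 20.0 * math.log10(abs(x))

    if level_db <= threshold_db:
        return x  # below the threshold: pass through untouched

    # Above the threshold: keep only 1/ratio of the overshoot.
    out_db = threshold_db + (level_db - threshold_db) / ratio

    # Turn the dB difference back into a linear gain factor and apply it.
    gain = 10.0 ** ((out_db - level_db) / 20.0)
    return x * gain


def compress(samples: list[float]) -> list[float]:
    """Apply the same fixed rule to every sample, one at a time."""
    return [compress_sample(s) for s in samples]
```

Feed it the same samples twice and you get the same output twice. Nothing is predicted, nothing is learned, and the rule never changes unless you turn a knob.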

Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real time denoisers trained on speech like VoiceGate or deep learning based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

The Real Issue: AI Musicians Who Don’t Understand Music Tools

This whole conversation actually exposed something deeper, which makes this topic interesting, and that is why I chose to highlight the idiocy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:

have never mixed a track manually,

never aligned vocals without an AI tool,

never programmed automation by hand,

never learned gain staging,

never rendered or layered anything intentionally,

and absolutely never used a DAW beyond dragging stems into the timeline.

Yet they lecture others on “how audio production really works”.

And when someone challenges their nonsense, they fire off buzzwords like:

“you refuse to be educated”

“you’re a luddite”

“DAWs used AI for decades!”

“it’s the same thing!”

No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

Why This Matters

The problem isn’t people using AI.

The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.

AI is powerful. It’s useful – but it doesn’t replace understanding.

If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”. You’re not misunderstood.

You’re just wrong. And loudly so.

If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:

Learn what your tools are.

Learn what your tools are not.

Stop claiming everything with buttons and soundwaves is AI.

Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.

Versions

  1. 2025-12-05 15:28:43
    Discovered: 2026-04-24 08:16:26 Hash: bdde7b7698eb2daba3719ec4a81f56d716d5674a
    Title:
    Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About
    Description:
    Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example – my personal favorite – came from an […]
    Content
    Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.

    Spoiler: they don’t. Not even close.

    The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.

    This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs work.

    Table of Contents
    A DAW IS NOT AI – IT WILL NEVER BE AI!
    “But VSTs Use AI!”
    Yes, some do – and that still doesn’t make your DAW AI!
    The Real Issue: AI Musicians Who Don’t Understand Music Tools
    Why This Matters

    A DAW IS NOT AI – IT WILL NEVER BE AI!

    A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.

    It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.

    AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on training data – a machine-based system that infers from input how to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). That is machine learning, which is something completely different from a sequencer that has existed since the 1980s.

    But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

    That ALSO includes the Google screenshot that was used to “prove me wrong”, which said: “Yes, DAWs use AI in a variety of ways to enhance music production”.

    The text used is not a technical definition. It is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is, by the way, extremely unreliable: it works very much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode.

    It talks about using AI-powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI-generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer-reviewed research.

    “But VSTs Use AI!”

    Yes, some do – and that still doesn’t make your DAW AI!

    This was the next brilliant argument thrown at me:

    “Most of the components in a DAW use AI. Are you slow?”

    First: No, they don’t.

    Second: Calling people slow, or throwing other random insults at them, doesn’t magically make your argument correct.

    Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.

    Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real-time denoisers trained on speech like VoiceGate, or deep-learning-based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use machine learning. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

    The Real Issue: AI Musicians Who Don’t Understand Music Tools

    This whole conversation actually exposed something deeper, which is what makes the topic interesting and why I chose to highlight the idiocy: many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:

    have never mixed a track manually,

    never aligned vocals without an AI tool,

    never programmed automation by hand,

    never learned gain staging,

    never rendered or layered anything intentionally,

    and absolutely never used a DAW beyond dragging stems into the timeline.

    Yet they lecture others on “how audio production really works”.

    And when someone challenges their nonsense, they fire off buzzwords like:

    “you refuse to be educated”

    “you’re a luddite”

    “DAWs used AI for decades!”

    “it’s the same thing!”

    No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

    Why This Matters

    The problem isn’t people using AI.

    The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.

    AI is powerful. It’s useful – but it doesn’t replace understanding.

    If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”. You’re not misunderstood.

    You’re just wrong. And loudly so.

    If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:

    Learn what your tools are.

    Learn what your tools are not.

    Stop claiming everything with buttons and soundwaves is AI.

    Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.
  2. 2025-12-05 15:28:43
    Discovered: 2026-04-24 08:14:23 Hash: d376ec321e6bb84aed112ef4528c71b7545d616b
    Title:
    Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About
    Description:
    Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared […]
  3. 2025-12-05 15:28:43
    Discovered: 2026-03-19 13:50:20 Hash: c700d9cdac72c1b650d96e4c5992db0e4d20c2a1
    Title:
    Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About
    Description:
    Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they...
  4. 2025-12-05 15:28:43
    Discovered: 2026-02-05 14:24:03 Hash: 3e81c80ea7c802eb7b125427806c792985432bbb
    Title:
    Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About
    Description:
    Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they...