People are predicting Suno’s death – how likely is it?
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.
That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.
I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.
What is actually happening
The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.
At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.
Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.
Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.
Why the “Suno will die” narrative keeps showing up
This narrative tends to appear because many doom posts combine three different things and present them as a single, coherent story.
First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.
Second, there is the contractual reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.
Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.
The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.
What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.
So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.
Claim: “A settlement will force a ‘clean model’ and kill creativity”
What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.
What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
This is the part where people accidentally become correct, but for the wrong reasons.
Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.
Contract reality matters: the ToS is designed to give the platform broad rights and broad control.
The practical takeaway is simple and non-dramatic:
Back up your WAVs/stems and project notes locally. Always.
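For anyone who wants to make that habit mechanical, here is a minimal sketch in Python. It assumes you have already downloaded your files into a local folder; it does not talk to Suno in any way. The function names, the extension list, and the manifest.json file name are my own illustrative choices, and the backup folder is assumed to sit outside the source folder.

```python
import hashlib
import json
import shutil
from pathlib import Path

# Hypothetical set of extensions worth keeping; adjust to your own exports.
KEEP = {".wav", ".mp3", ".flac", ".txt", ".md"}


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large WAVs do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_with_manifest(src: Path, dst: Path) -> dict[str, str]:
    """Copy matching files from src to dst and record their checksums.

    Assumes dst is not located inside src.
    """
    manifest: dict[str, str] = {}
    for item in sorted(src.rglob("*")):
        if item.is_file() and item.suffix.lower() in KEEP:
            relative = item.relative_to(src)
            target = dst / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, target)  # copy2 also preserves timestamps
            manifest[relative.as_posix()] = sha256_of(target)
    dst.mkdir(parents=True, exist_ok=True)
    # The manifest lets you verify later that a backup copy is still intact.
    (dst / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Calling backup_with_manifest(Path("exports"), Path("/mnt/backup/suno")) with your own paths would copy the audio and notes and leave a checksum list next to them. It is a starting point, not a full backup strategy.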
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
Possible: yes, as a policy choice.
Inevitable: no.
Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.
So what are the real risks for Suno?
Think in terms of business incentives.
High probability changes
When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.
Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.
Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.
Medium probability changes
It is also plausible that Suno will introduce stronger similarity or compliance checks. These would not necessarily resemble YouTube-style Content ID systems; more likely they would take the form of softer pressure aimed at avoiding the generation of obvious sound-alikes.
In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.
Lower probability, but still worth planning for
There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.
A complete shutdown of the service is unlikely, but it is never impossible for a SaaS business and should not be treated as unthinkable.
What about those of us who actually do the work?
Here is the split that regulation will make clearer over time.
If you actually create something
If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.
At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.
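One low-effort way to keep that documentation is an append-only log next to your project files. The sketch below is only an illustration: the Contribution fields, the function names, and the JSON Lines format are assumptions of mine, not anything Suno or a lawyer prescribes, and a log like this is a personal record rather than proof of ownership.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Contribution:
    track: str      # your own working title
    kind: str       # e.g. "lyrics", "edit", "arrangement", "mix"
    note: str       # what you actually did, in your own words
    when: str = ""  # filled in automatically when left empty


def log_contribution(log_file: Path, entry: Contribution) -> None:
    """Append one entry as a JSON line; earlier lines are never rewritten."""
    if not entry.when:
        entry.when = datetime.now(timezone.utc).isoformat()
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(asdict(entry), ensure_ascii=False) + "\n")


def read_log(log_file: Path) -> list[dict]:
    """Load every logged entry, oldest first."""
    with log_file.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

A call such as log_contribution(Path("notes.jsonl"), Contribution("Demo 3", "lyrics", "rewrote second verse")) adds one dated line. Keeping the file in the same folder as your stems means it gets backed up along with them.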
If you do nothing and just press generate
This is where it all goes to shit, and yes, this is exactly where regulation is needed.
When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.
So my position is simple and non-negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.
What you should do right now
This does not require panic. It does require using your head.
Read the agreement you have actually agreed to. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.
That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.
Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.
So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
ebc4881ceeb7fba08d75edb6c73fd894dce57b22
2f7e4a0d8c75dd7792918888f337748908ecb6f5
TITLE: People are predicting Suno’s death – how likely is it? DESCRIPTION: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […] CONTENT: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. At the same time, licensing deals are real as well. 
Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. 
Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits. Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits. Medium probability changes It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes. 
In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk. Lower probability, but still worth planning for There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported. A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable. What about us who actually do the work? Here is the split that regulation will make clearer over time. If you actually create something If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas. At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership. If you do nothing and just press generate This is where it all goes to shit, and yes, this is exactly where regulation is needed. When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else. So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine. 
What you should do right now This does not require panic. It does require using your head. Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework. That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved. Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook. So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
TITLE: People are predicting Suno’s death – how likely is it? DESCRIPTION: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […] CONTENT: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. 
In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. 
So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits. Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits. Medium probability changes It is also plausible that Suno will introduce stronger similarity or compliance checks. 
This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes. In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk. Lower probability, but still worth planning for There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported. A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable. What about us who actually do the work? Here is the split that regulation will make clearer over time. If you actually create something If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas. At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership. If you do nothing and just press generate This is where it all goes to shit, and yes, this is exactly where regulation is needed. When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else. So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. 
And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine. What you should do right now This does not require panic. It does require using your head. Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework. That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved. Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook. So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
696e004598696c9b32ec7879894bc14619881ea9
ebc4881ceeb7fba08d75edb6c73fd894dce57b22
TITLE: People are predicting Suno’s death – how likely is it? DESCRIPTION: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main... CONTENT: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. 
At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. 
Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits. Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits. Medium probability changes It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes. 
In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.
Lower probability, but still worth planning for
There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.
A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.
What about us who actually do the work?
Here is the split that regulation will make clearer over time.
If you actually create something
If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.
At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.
If you do nothing and just press generate
This is where it all goes to shit, and yes, this is exactly where regulation is needed.
When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.
So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.
What you should do right now
This does not require panic. It does require using your head.
Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written.
The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.
That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.
Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.
So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
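For anyone who wants "keep local copies" to be more than a good intention, here is a minimal sketch of one way to do it. It assumes you have already downloaded your WAVs, stems, and notes into a folder on your own machine. The function name `backup_exports`, the folder layout, and the list of file extensions are my own illustrative choices. Nothing here talks to Suno or relies on any Suno API.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

# Assumption: these are the file types worth keeping. Adjust to your own workflow.
KEEP_SUFFIXES = {".wav", ".flac", ".mp3", ".mid", ".txt", ".md"}


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_exports(source: Path, backup_root: Path) -> Path:
    """Copy kept files from `source` into a timestamped folder under `backup_root`.

    Writes a manifest.json listing each copied file with its size and checksum,
    so you can later show what you had and when you saved it.
    Keep `backup_root` outside `source`, or old backups get copied again.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_root / stamp
    target.mkdir(parents=True, exist_ok=True)

    manifest = []
    for item in sorted(source.rglob("*")):
        if item.is_file() and item.suffix.lower() in KEEP_SUFFIXES:
            relative = item.relative_to(source)
            copy = target / relative
            copy.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, copy)  # copy2 also preserves modification times
            manifest.append({
                "file": relative.as_posix(),
                "bytes": copy.stat().st_size,
                "sha256": sha256_of(copy),
            })

    manifest_path = target / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return target
```

Calling `backup_exports(Path("exports"), Path("backups"))` after each session gives you a dated folder plus a manifest. The timestamped folders double as a crude version history, and the notes files are where the documentation of your own lyrics, edits, and arrangement choices belongs.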
696e004598696c9b32ec7879894bc14619881ea9Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main...
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.
That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.
I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.
What is actually happening
The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.
At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.
Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.
Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.
Why the “Suno will die” narrative keeps showing up
This narrative keeps appearing because many doom posts blend three different things and present them as a single, coherent story.
First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.
Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.
Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.
The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.
What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.
So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.
Claim: “A settlement will force a ‘clean model’ and kill creativity”
What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.
What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
This is the part where people accidentally end up correct, but for the wrong reasons.
Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.
Contract reality matters: the ToS is designed to give the platform broad rights and broad control.
The practical takeaway is simple and non-dramatic:
Back up your WAVs/stems and project notes locally. Always.
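A backup is only useful if you can tell later that it is still complete. As a minimal sketch, assuming your exports live in one local folder, the Python below records a SHA-256 checksum for every tracked file and reports anything that has since gone missing or changed. The extension list and the function names are my own illustrative choices, not anything Suno provides.

```python
import hashlib
from pathlib import Path

# Hypothetical set of extensions worth tracking; adjust to your own exports.
TRACKED_EXTENSIONS = {".wav", ".flac", ".mp3", ".txt", ".md"}


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_dir: Path) -> dict[str, str]:
    """Map each tracked file (path relative to backup_dir) to its checksum."""
    manifest = {}
    for path in sorted(backup_dir.rglob("*")):
        if path.is_file() and path.suffix.lower() in TRACKED_EXTENSIONS:
            manifest[path.relative_to(backup_dir).as_posix()] = sha256_of(path)
    return manifest


def find_problems(backup_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose content differs from the manifest."""
    problems = []
    for relative, expected in manifest.items():
        target = backup_dir / relative
        if not target.is_file():
            problems.append(f"missing: {relative}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {relative}")
    return problems
```

Run `build_manifest` once after each export session, store the result next to the files, and run `find_problems` against it whenever you move or copy the archive.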
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
Possible: yes, as a policy choice.
Inevitable: no.
Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.
So what are the real risks for Suno?
Think in terms of business incentives.
High probability changes
When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.
Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.
Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.
Medium probability changes
It is also plausible that Suno will introduce stronger similarity or compliance checks. These would not necessarily look like a YouTube-style Content ID system; softer pressure aimed at steering generations away from obvious sound-alikes is more likely.
In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.
Lower probability, but still worth planning for
There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.
A complete shutdown of the service is unlikely, but no SaaS business is immune to one, so it should not be treated as unthinkable.
What about us who actually do the work?
Here is the split that regulation will make clearer over time.
If you actually create something
If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.
At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.
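Documenting what you did does not need special software. One possible approach, sketched here in Python, is an append-only log with one JSON line per contribution. The `ContributionEntry` fields and the helper names are hypothetical; this is not a legal standard or a format any platform asks for.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ContributionEntry:
    """One human contribution to a track. All field names are illustrative."""
    track: str
    day: str          # ISO date string, e.g. "2025-01-31"
    kind: str         # e.g. "lyrics", "edit", "arrangement", "mix", "master"
    description: str
    files: list[str]  # local source files that support the entry


def append_entry(log_path: Path, entry: ContributionEntry) -> None:
    """Append one entry as a JSON line so earlier entries are never rewritten."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(asdict(entry), ensure_ascii=False) + "\n")


def read_entries(log_path: Path) -> list[ContributionEntry]:
    """Load every entry back in the order it was written."""
    with log_path.open(encoding="utf-8") as handle:
        return [ContributionEntry(**json.loads(line)) for line in handle if line.strip()]
```

Keeping the log in the same folder as the source files, under version control if you use it, means the notes and the audio they describe travel together.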
If you do nothing and just press generate
This is where it all goes to shit, and yes, this is exactly where regulation is needed.
When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.
So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.
What you should do right now
This does not require panic. It does require using your head.
Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.
That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.
Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.
So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
TITLE: People are predicting Suno’s death – how likely is it?
TITLE: People are predicting Suno’s death – how likely is it? DESCRIPTION: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […] CONTENT: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. 
In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. 
So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits. Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits. Medium probability changes It is also plausible that Suno will introduce stronger similarity or compliance checks. 
This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes. In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk. Lower probability, but still worth planning for There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported. A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable. What about us who actually do the work? Here is the split that regulation will make clearer over time. If you actually create something If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas. At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership. If you do nothing and just press generate This is where it all goes to shit, and yes, this is exactly where regulation is needed. When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else. So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. 
And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine. What you should do right now This does not require panic. It does require using your head. Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework. That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved. Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook. So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
696e004598696c9b32ec7879894bc14619881ea9
ebc4881ceeb7fba08d75edb6c73fd894dce57b22
TITLE: People are predicting Suno’s death – how likely is it? DESCRIPTION: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main... CONTENT: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. 
At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. 
Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits. Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits. Medium probability changes It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes. 
In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.
Lower probability, but still worth planning for
There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.
A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.
What about us who actually do the work?
Here is the split that regulation will make clearer over time.
If you actually create something
If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.
At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.
If you do nothing and just press generate
This is where it all goes to shit, and yes, this is exactly where regulation is needed.
When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.
So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility.
And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.
What you should do right now
This does not require panic. It does require using your head.
Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written.
The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.
That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.
Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.
So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
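For anyone who wants to make the "keep local copies" advice routine, here is a minimal sketch in Python. It assumes you have already downloaded your WAVs/stems by hand into a folder. It does not talk to Suno in any way. The function name `back_up`, the `AUDIO_SUFFIXES` set, and the `manifest.json` layout are all made up for illustration. It copies the audio files into a backup folder and writes a manifest with checksums and your own notes, so you have a dated record of what you kept and what you did to it.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path

# Assumption: these are the formats you export. Adjust to your own workflow.
AUDIO_SUFFIXES = {".wav", ".mp3", ".flac"}


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir: Path, backup_dir: Path, notes: str = "") -> Path:
    """Copy audio files from source_dir into backup_dir and write manifest.json.

    Assumption: backup_dir lives outside source_dir, so copies are not re-scanned.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    entries = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        dest = backup_dir / src.relative_to(source_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves file timestamps
        entries.append(
            {
                "file": dest.relative_to(backup_dir).as_posix(),
                "sha256": sha256_of(dest),
                "bytes": dest.stat().st_size,
            }
        )
    manifest = {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "notes": notes,  # e.g. what you wrote, edited, re-recorded, or mixed
        "files": entries,
    }
    manifest_path = backup_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest_path
```

A call such as `back_up(Path("exports"), Path("backups/2026-01"), notes="own lyrics, re-recorded vocals")` would produce one dated folder per run (both paths are placeholders). The manifest is only a personal record. It does not prove ownership of anything.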
People are predicting Suno’s death – how likely is it?
Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”.
That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”.
I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”.
What is actually happening
The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs.
At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026.
Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled.
Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away.
Why the “Suno will die” narrative keeps showing up
This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story.
First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy.
Second, there is a contractual reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage.
Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering.
The first two elements are grounded in reality. The third is usually narrative-building rather than evidence.
Latest claims I have seen in that thread
Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”
What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs.
What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist.
So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law.
Claim: “A settlement will force a ‘clean model’ and kill creativity”
What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns.
What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large.
Claim: “You don’t own anything, you are renting, and your catalog can vanish”
This is the part where people accidentally become correct, but for the wrong reasons.
Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down.
Contract reality matters: the ToS is designed to give the platform broad rights and broad control.
The practical takeaway is simple and non-dramatic:
Back up your WAVs/stems and project notes locally. Always.
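If you want that to be a habit rather than a good intention, a small script helps. This is a minimal sketch, not a Suno tool: it assumes you have already downloaded your files into a local folder, and the function name `back_up`, the suffix list, and the folder layout are illustrative choices of mine, not anything the platform provides. It copies matching files into a dated folder and writes a `manifest.json` of SHA-256 checksums so you can later check that nothing was lost or corrupted.

```python
import hashlib
import json
import shutil
from datetime import date
from pathlib import Path

# Assumption: these are the file types worth keeping. Adjust to your own workflow.
KEEP_SUFFIXES = {".wav", ".flac", ".mp3", ".txt", ".md"}


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir: Path, backup_root: Path) -> Path:
    """Copy audio and note files into backup_root/<today> and write a checksum manifest.

    Keep backup_root outside source_dir, otherwise old backups get copied again.
    """
    target = backup_root / date.today().isoformat()
    target.mkdir(parents=True, exist_ok=True)

    manifest = {}
    for item in sorted(source_dir.rglob("*")):
        if item.is_file() and item.suffix.lower() in KEEP_SUFFIXES:
            relative = item.relative_to(source_dir)
            destination = target / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, destination)  # copy2 also preserves timestamps
            manifest[relative.as_posix()] = sha256_of(destination)

    manifest_path = target / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return target
```

Calling `back_up(Path("suno_exports"), Path("backups"))` would do the job, where both folder names are placeholders for your own. The dated folder gives you a crude version history, and the manifest gives you something concrete to point at if you ever need to show what existed on your disk and when.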
Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”
Possible: yes, as a policy choice.
Inevitable: no.
Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome.
So what are the real risks for Suno?
Think in terms of business incentives.
High probability changes
When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely.
Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits.
Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits.
Medium probability changes
It is also plausible that Suno will introduce stronger similarity or compliance checks. These would not necessarily resemble a YouTube-style Content ID system; more likely they would act as softer pressure that steers generations away from obvious sound-alikes.
In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.
Lower probability, but still worth planning for
There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.
A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.
What about us who actually do the work?
Here is the split that regulation will make clearer over time.
If you actually create something
If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.
At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.
If you do nothing and just press generate
This is where it all goes to shit, and yes, this is exactly where regulation is needed.
When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.
So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.
What you should do right now
This does not require panic. It does require using your head.
Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.
That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.
Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.
So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
ebc4881ceeb7fba08d75edb6c73fd894dce57b22
2f7e4a0d8c75dd7792918888f337748908ecb6f5
TITLE: People are predicting Suno’s death – how likely is it? DESCRIPTION: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music […] CONTENT: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. At the same time, licensing deals are real as well. 
Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. 
Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits. Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits. Medium probability changes It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes. 
In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk. Lower probability, but still worth planning for There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported. A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable. What about us who actually do the work? Here is the split that regulation will make clearer over time. If you actually create something If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas. At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership. If you do nothing and just press generate This is where it all goes to shit, and yes, this is exactly where regulation is needed. When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else. So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine. 
What you should do right now This does not require panic. It does require using your head. Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework. That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved. Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook. So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
TITLE: People are predicting Suno’s death – how likely is it? DESCRIPTION: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward […] CONTENT: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. 
In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. 
So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits. Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits. Medium probability changes It is also plausible that Suno will introduce stronger similarity or compliance checks. 
This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes. In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk. Lower probability, but still worth planning for There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported. A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable. What about us who actually do the work? Here is the split that regulation will make clearer over time. If you actually create something If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas. At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership. If you do nothing and just press generate This is where it all goes to shit, and yes, this is exactly where regulation is needed. When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else. So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility. 
And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine. What you should do right now This does not require panic. It does require using your head. Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written. The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework. That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved. Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook. So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
696e004598696c9b32ec7879894bc14619881ea9
ebc4881ceeb7fba08d75edb6c73fd894dce57b22
TITLE: People are predicting Suno’s death – how likely is it? DESCRIPTION: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main... CONTENT: Not long ago, Udio effectively took the “fine, we’ll license it” route: it reached a strategic agreement with Universal Music Group around a new licensed AI music platform planned for 2026. That came after the 2024 lawsuits became the main backdrop for the entire AI-music sector. In other words: this space is not moving toward “no regulation”. It is moving toward “pay for access, pay for rights, gate features, control the pipeline”. That context matters when people on Facebook start writing doom posts about Suno “going down” or getting “lobotomized”. I am not going to pretend the risk is zero. But the likely future is not “Suno dies”. The likely future is “Suno changes”. Table of Contents Toggle What is actually happeningWhy the “Suno will die” narrative keeps showing upLatest claims I have seen in that threadClaim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music”Claim: “A settlement will force a ‘clean model’ and kill creativity”Claim: “You don’t own anything, you are renting, and your catalog can vanish”Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky”So what are the real risks for Suno?High probability changesMedium probability changesLower probability, but still worth planning forWhat about us who actually do the work?If you actually create somethingIf you do nothing and just press generateWhat you should do right now What is actually happening The lawsuits are real. In June 2024, the major labels sued Suno and Udio for alleged copyright infringement tied to training data and generated outputs. 
At the same time, licensing deals are real as well. Udio has a publicly announced agreement with Universal Music Group aimed at building a licensed AI music creation platform, with a stated target around 2026. Suno is moving in the same direction. Warner Music Group has entered a licensing partnership with Suno that also points toward licensed models in 2026, along with changes to how downloads and access are handled. Taken together, this follows a familiar industry pattern: first comes litigation, then comes licensing, once it becomes clear that the technology itself is not going away. Why the “Suno will die” narrative keeps showing up This narrative tends to appear because many doom posts combine several different things and present them as a single, coherent story. First, there is a real legal problem. Lawsuits against AI-music platforms exist, and they are serious enough to affect business decisions and long-term strategy. Second, there is a real contract reality. Terms of Service are written to protect the platform, not the user’s sense of authorship or emotional investment, and they already allow broad control over access and usage. Third, a speculative causal chain is often added on top. A temporary outage is interpreted as legal panic, which then becomes secret audits, year-end reporting pressure, or a hidden rollout of copyright filtering. The first two elements are grounded in reality. The third is usually narrative-building rather than evidence. Latest claims I have seen in that thread Claim: “Suno was trained on all humanity’s music, therefore it will be forced into public domain and stock music” What holds: Licensing pressure is pushing companies toward licensed datasets and opt-in catalogs. What does not: “Licensed” does not automatically mean “Mozart only”. Licensed can mean modern catalogs, if the business deals exist. So the conclusion “it will become elevator music” is not a fact. It is a taste prediction dressed up as law. 
Claim: “A settlement will force a ‘clean model’ and kill creativity” What holds: Restrictions can reduce the model’s freedom to imitate specific mainstream patterns. What does not: Creativity is not a single knob called “trained illegally”. Plenty of music is great under constraints. Also, new licensed catalogs can still be large. Claim: “You don’t own anything, you are renting, and your catalog can vanish” This is the part where people accidentally become correct, but for the wrong reasons. Platform risk is real: any cloud service can change tiers, cap downloads, remove features, or even shut down. Contract reality matters: the ToS is designed to give the platform broad rights and broad control. The practical takeaway is simple and non-dramatic: Back up your WAVs/stems and project notes locally. Always. Claim: “Suno will retroactively lock or delete older songs because the old models are legally risky” Possible: yes, as a policy choice. Inevitable: no. Companies do sometimes quarantine “legacy” features. They also often keep them accessible to avoid user revolt. The honest position is: it is a risk, but not a guaranteed outcome. So what are the real risks for Suno? Think in terms of business incentives. High probability changes When licensed models are introduced, older models are likely to be phased out over time rather than supported indefinitely. Download rules are also likely to tighten. Free tiers may lose download rights entirely, while paid tiers may face caps or stricter limits. Pricing and credit structures are likely to change as well. Licensing is expensive, and those costs tend to be passed down to users through higher prices, fewer credits, or tighter usage limits. Medium probability changes It is also plausible that Suno will introduce stronger similarity or compliance checks. This would not necessarily resemble YouTube-style Content ID systems, but rather softer pressure aimed at avoiding the generation of obvious sound-alikes. 
In addition, restrictions on uploads may increase. This is particularly likely when users upload audio files specifically to steer or constrain generations, as that carries higher legal and licensing risk.

Lower probability, but still worth planning for

There is also a lower-probability risk that access to some legacy outputs could be removed retroactively, or that such material could be reclassified as non-commercial or unsupported.

A complete shutdown of the service is unlikely, but it is never entirely impossible in any SaaS-based business and should not be treated as unthinkable.

What about us who actually do the work?

Here is the split that regulation will make clearer over time.

If you actually create something

If you write lyrics, arrange, edit, re-record, mix, master, and build something with intent, you can still treat Suno as a sketchpad, as a collaborator, and as a generator of stems and ideas.

At the same time, you should behave accordingly. That means keeping source files and version history, documenting what you actually did in terms of lyrics, edits, arrangement choices, and post-production, and not assuming that a paid subscription automatically equals copyright ownership.

If you do nothing and just press generate

This is where it all goes to shit, and yes, this is exactly where regulation is needed.

When people brag “I created and produced this” while doing absolutely nothing, they are not just annoying. They flood the platforms with garbage. Output turns into spam, quality drops, and the legal risk goes up. That is what forces companies to lock things down harder for everyone else.

So my position is simple and not negotiable. Regulation is welcome, not to kill AI music, but to put clear lines around licensing, consent, attribution, and responsibility.

And if you want credit for a piece of music, you need to have actually contributed something. Otherwise you are not a creator, you are just occupying space in a very loud machine.
What you should do right now

This does not require panic. It does require using your head.

Read the agreement you are actually using. Not a Reddit summary, not a Facebook hot take, not even this article – but the Terms of Service as written.

The copyright attorney whose video was linked in that thread has been issuing the same warnings for years, across multiple AI platforms. Her core message has always been the same: these services are built to protect the company first, and whatever rights you think you have only exist within that framework.

That does not mean Suno is about to implode, or that your music will suddenly be deleted tomorrow. It does mean that copyright pressure will eventually collide with the current free-for-all, because it always does when enough money is involved.

Whether that collision results in fines, settlements, licensing fees, tighter controls, or all of the above depends largely on how much capital Suno has set aside to absorb legal pressure, pay damages, or buy peace through licensing. None of that is visible to users, and none of it is decided by vibes on Facebook.

So the reasonable position is boring but solid. Keep local copies of anything you care about. Treat Suno as a tool, not a vault. Assume rules will tighten over time. And stop confusing convenience with ownership.
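For the "keep local copies" part, a minimal sketch of what that could look like in practice: a small Python script that walks a local backup folder and writes a manifest with file sizes and SHA-256 checksums, so there is a dated record of what was saved. The folder layout, the `manifest.json` name, and the function names are all assumptions made for illustration; nothing here comes from Suno or its tooling.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large WAVs do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path, skip_name: str = "manifest.json") -> list[dict]:
    """List every file under `folder` (except the manifest itself) with size and hash."""
    entries = []
    for path in sorted(folder.rglob("*")):
        if path.is_file() and path.name != skip_name:
            entries.append({
                "file": path.relative_to(folder).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return entries


def write_manifest(folder: Path, name: str = "manifest.json") -> Path:
    """Write the manifest next to the backed-up files and return its path."""
    manifest = {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "files": build_manifest(folder, skip_name=name),
    }
    target = folder / name
    target.write_text(json.dumps(manifest, indent=2, ensure_ascii=False), encoding="utf-8")
    return target
```

Running `write_manifest(Path("my-backup-folder"))` after each export session gives a simple record of which stems, mixes, and notes existed at what time. It is not a substitute for real version control, only a cheap habit.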
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. For example, I need text to paste into Suno that only exists as an .m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually), and so on.
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
Table of Contents Toggle whisper.batwhisper.reg (explorer right clicks)installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself whisper.bat
@echo off setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö) chcp 65001 >nul
REM File passed from Explorer set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now) for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file wsl bash -lc "/usr/local/tornevall/whisper "%WSL_FILE%""
endlocal
whisper.reg (explorer right clicks)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT*\shell\WhisperWSL] @="Transkribera med Whisper (WSL)" "Icon"="wsl.exe"
[HKEY_CLASSES_ROOT*\shell\WhisperWSL\command] @=""F:\viktigt\Private\Linux-Scripts\Whisper.bat" "%1""
installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)
To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts.
#!/usr/bin/env bash set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}" MODE="install"
while getopts ":u" opt; do case "$opt" in u) MODE="uninstall" ;; *) echo "Usage: $0 [-u]" exit 1 ;; esac done
echo "==> Whisper installer (GTX 1060 compatible)" echo "==> Mode: $MODE"
if [[ ! -d "$VENV_DIR" ]]; then echo "Error: venv not found: $VENV_DIR" exit 1 fi
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
if [[ "$MODE" == "uninstall" ]]; then echo "==> Uninstalling incompatible packages ONLY (-u)"
pip uninstall -y torch torchvision torchaudio || true pip uninstall -y numpy || true
echo "" echo "Done." echo "Uninstall completed. Nothing else touched." exit 0 fi
echo "==> Installing compatible stack (no forced uninstall)"
pip install
numpy==1.26.4
torch==1.13.1+cu116
torchvision==0.14.1+cu116
torchaudio==0.13.1
--extra-index-url https://download.pytorch.org/whl/cu116
echo "==> Verifying environment" python - << 'EOF' import torch, numpy print("Torch:", torch.version) print("NumPy:", numpy.version) print("CUDA available:", torch.cuda.is_available()) if torch.cuda.is_available(): print("GPU:", torch.cuda.get_device_name(0)) print("Capability:", torch.cuda.get_device_capability(0)) EOF
echo "" echo "Done." echo "Install completed without destructive actions."
The script itself
The script can run without any switches – and only with the audio file intended to be transcribed (but as you can see, it can do a bit more).
#!/usr/bin/env bash set -euo pipefail
if [[ $# -lt 1 ]]; then echo "Usage: whisper <input.extension> [model] [language]" exit 1 fi
INPUT="$1" MODEL="${2:-small}" LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then echo "Error: Input file not found: $INPUT" exit 1 fi
BASENAME="$(basename "$INPUT")" STEM="${BASENAME%.*}" OUTDIR="$(dirname "$INPUT")" OUTPUT="$OUTDIR/$STEM.txt"
if [[ -f "$OUTPUT" ]]; then echo "Error: Output file already exists:" echo " $OUTPUT" echo "Aborting to avoid overwrite." exit 1 fi
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}" WHISPER_BIN="whisper" if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then WHISPER_BIN="$WHISPER_VENV/bin/whisper" fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then echo "Error: whisper not found in PATH or venv." exit 1 fi
TMPDIR="$(mktemp -d)" cleanup() { rm -rf "$TMPDIR"; } trap cleanup EXIT
echo "==> Transcribing:" echo " input: $INPUT" echo " output: $OUTPUT" echo " model: $MODEL" echo " lang: ${LANGUAGE:-auto}"
ARGS=( "$INPUT" --model "$MODEL" --output_dir "$TMPDIR" --output_format txt --task transcribe --verbose False --fp16 False )
if [[ -n "$LANGUAGE" ]]; then ARGS+=( --language "$LANGUAGE" ) fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt" if [[ ! -f "$GENERATED_TXT" ]]; then FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)" if [[ -z "${FOUND_TXT:-}" ]]; then echo "Error: No .txt output produced." exit 1 fi GENERATED_TXT="$FOUND_TXT" fi
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:" echo " $OUTPUT"
16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
7048054bb2d73799a6f2563ca0267e8a302b4ff0
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I sometimes need a text transcribed so I can paste it into Suno, but the text only exists inside an .m4a file (that is, music, which sometimes comes with hardcoded subtitles that have to be transcribed manually).
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew there had to be software capable of doing this completely locally. I had previously discovered that Whisper also exists in open-source form (I’m not even sure whether the Windows application builds on it). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
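To make the path conversion step more concrete, here is a rough sketch of what the wslpath call does for an ordinary drive-letter path. The function win_to_wsl_path is a hypothetical helper written only for illustration and is not part of the setup above. It assumes the default /mnt/<drive> mount point and ignores UNC paths and custom mounts, which the real wslpath handles.

```shell
#!/usr/bin/env bash
# Illustration only: approximates wslpath for plain drive-letter paths.
# Assumption: drives are mounted under /mnt/<lowercase letter> (the WSL default).
win_to_wsl_path() {
  win="$1"
  # Require a leading drive letter followed by a colon, e.g. "C:".
  case "$win" in
    [A-Za-z]:*) ;;
    *)
      printf 'Error: not a drive-letter path: %s\n' "$win" >&2
      return 1
      ;;
  esac
  # Lowercase the drive letter.
  drive=$(printf '%s' "$win" | cut -c1 | tr 'A-Z' 'a-z')
  # Drop the "X:" prefix and turn backslashes into forward slashes.
  rest=$(printf '%s' "${win#??}" | tr '\\' '/')
  printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Prints /mnt/f/viktigt/Private/Linux-Scripts/Whisper.bat
win_to_wsl_path 'F:\viktigt\Private\Linux-Scripts\Whisper.bat'
```

Spaces and non-ASCII characters pass through untouched, which is why the batch file only has to quote the result before handing it to bash.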
whisper.reg (explorer right clicks)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
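The doubled backslashes and escaped quotes in the command value are easy to get wrong when adapting the path by hand. As a small sketch, the hypothetical helpers below (reg_escape and reg_command_value, both invented for this example) build that line from a plain Windows path, assuming the usual .reg string rules: a backslash is written as two backslashes and a double quote as backslash-quote.

```shell
#!/usr/bin/env bash
# Illustration only: build the @="..." command line of a .reg file.

# Escape a string for use inside a quoted .reg value.
reg_escape() {
  # Double every backslash first, then escape double quotes.
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print the default value for a shell\...\command key.
reg_command_value() {
  # Raw command line Explorer should run: "<path to bat>" "%1"
  raw="\"$1\" \"%1\""
  printf '@="%s"\n' "$(reg_escape "$raw")"
}

# Reproduces the command line used in whisper.reg above.
reg_command_value 'F:\viktigt\Private\Linux-Scripts\Whisper.bat'
```

Replace the path with wherever your own copy of the batch file lives and paste the printed line under the command key.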
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script has a -u switch that uninstalls the pinned packages (torch, torchvision, torchaudio, and numpy) from the venv. If the first install goes wrong, run the script with -u and then install again to avoid conflicts.
#!/usr/bin/env bash
set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"
# --- Parse args ---
while getopts ":u" opt; do
  case "$opt" in
    u) MODE="uninstall" ;;
    *)
      echo "Usage: $0 [-u]"
      exit 1
      ;;
  esac
done
echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"
# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
  echo "Error: venv not found: $VENV_DIR"
  exit 1
fi
# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
  echo "==> Uninstalling incompatible packages ONLY (-u)"
  pip uninstall -y torch torchvision torchaudio || true
  pip uninstall -y numpy || true
  echo ""
  echo "Done."
  echo "Uninstall completed. Nothing else touched."
  exit 0
fi
# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
  numpy==1.26.4 \
  torch==1.13.1+cu116 \
  torchvision==0.14.1+cu116 \
  torchaudio==0.13.1 \
  --extra-index-url https://download.pytorch.org/whl/cu116
# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF
echo ""
echo "Done."
echo "Install completed without destructive actions."
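If you want to confirm later that nothing has silently upgraded the pinned packages, a check along these lines might help. This is only a sketch: check_pin is a hypothetical helper, not part of the installer, and it assumes input in the name==version format that pip list --format=freeze prints.

```shell
#!/usr/bin/env bash
# Illustration only: compare one installed package against its pinned version.
# Reads "name==version" lines on stdin.
check_pin() {
  name="$1"
  want="$2"
  # Take the first line for this exact package name and keep the version part.
  have=$(grep -i "^${name}==" | head -n 1 | cut -d= -f3-)
  if [ "$have" = "$want" ]; then
    printf 'OK        %s %s\n' "$name" "$have"
  else
    printf 'MISMATCH  %s want=%s have=%s\n' "$name" "$want" "${have:-none}"
    return 1
  fi
}

# Demo with sample data instead of a live environment.
printf 'numpy==1.26.4\ntorch==1.13.1+cu116\n' | check_pin torch 1.13.1+cu116
```

Inside the activated venv, the real thing would be fed from pip, for example: python -m pip list --format=freeze | check_pin torch 1.13.1+cu116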
The script itself
The script needs no switches, only the audio file to transcribe. As the usage line shows, it also accepts an optional model and language.
#!/usr/bin/env bash
set -euo pipefail
# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists
if [[ $# -lt 1 ]]; then
  echo "Usage: whisper <input.extension> [model] [language]"
  exit 1
fi
INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then
  echo "Error: Input file not found: $INPUT"
  exit 1
fi
BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"
# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
  echo "Error: Output file already exists:"
  echo " $OUTPUT"
  echo "Aborting to avoid overwrite."
  exit 1
fi
# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
  WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
  echo "Error: whisper not found in PATH or venv."
  exit 1
fi
TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT
echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"
ARGS=(
  "$INPUT"
  --model "$MODEL"
  --output_dir "$TMPDIR"
  --output_format txt
  --task transcribe
  --verbose False
  --fp16 False
)
if [[ -n "$LANGUAGE" ]]; then
  ARGS+=( --language "$LANGUAGE" )
fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
  FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
  if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
  fi
  GENERATED_TXT="$FOUND_TXT"
fi
# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:"
echo " $OUTPUT"
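The output naming is worth a closer look, since it decides where the transcript ends up and when the script refuses to run. The sketch below isolates that step. The function name transcript_path is hypothetical and exists only for this example; the logic mirrors the BASENAME, STEM and OUTDIR lines in the script above.

```shell
#!/usr/bin/env bash
# Illustration only: derive the transcript path the same way the runner does.
transcript_path() {
  input="$1"
  base=$(basename "$input")   # file name without directories
  stem="${base%.*}"           # strip only the last extension
  dir=$(dirname "$input")     # "." when no directory is given
  printf '%s/%s.txt\n' "$dir" "$stem"
}

# Prints /mnt/f/music/My Song.txt
transcript_path '/mnt/f/music/My Song.m4a'
```

Only the last extension is stripped, so song.final.mp3 becomes song.final.txt. It also means two inputs that differ only in extension, such as song.m4a and song.mp3, map to the same transcript, and the second run stops at the overwrite check.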
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]
CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that has to be manually transcribed). Etc.
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
Table of Contents
Toggle
whisper.batwhisper.reg (explorer right clicks)installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)The script itself
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
whisper.reg (explorer right clicks)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
installer för WSL/Linux (with 1060-compatibilty and pre-uninstaller)
To make sure stuff are removed properly before reinstalling there is a -u switch for this in the script. In case you make it wrong the first time, this switch is there to make sure you can reinstall it a second time without conflicts.
#!/usr/bin/env bash
set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"
# --- Parse args ---
while getopts ":u" opt; do
case "$opt" in
u) MODE="uninstall" ;;
*)
echo "Usage: $0 [-u]"
exit 1
;;
esac
done
echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"
# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
echo "Error: venv not found: $VENV_DIR"
exit 1
fi
# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
echo "==> Uninstalling incompatible packages ONLY (-u)"
pip uninstall -y torch torchvision torchaudio || true
pip uninstall -y numpy || true
echo ""
echo "Done."
echo "Uninstall completed. Nothing else touched."
exit 0
fi
# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
numpy==1.26.4 \
torch==1.13.1+cu116 \
torchvision==0.14.1+cu116 \
torchaudio==0.13.1 \
--extra-index-url https://download.pytorch.org/whl/cu116
# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF
echo ""
echo "Done."
echo "Install completed without destructive actions."
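The verify step prints the versions but leaves the comparison to your eyes. If you would rather have a pass/fail check, a small sketch like the one below can be fed the output of pip freeze. has_pin and check_pins are hypothetical helpers, and the sketch assumes pip freeze prints one name==version entry per line, including the +cu116 suffix. torchaudio is left out on purpose, since its reported version string may or may not carry a suffix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads pip-freeze-style lines on stdin and succeeds
# only if the exact "name==version" pin is present.
has_pin() {
  local name="$1" version="$2"
  # -F: fixed string (so "+" and "." are literal), -x: whole line, -q: quiet
  grep -Fxq -- "${name}==${version}"
}

# Hypothetical helper: checks the pins used by the installer against
# freeze output given on stdin. Returns non-zero if any pin is missing.
check_pins() {
  local freeze pin status=0
  freeze="$(cat)"
  for pin in "numpy 1.26.4" "torch 1.13.1+cu116" "torchvision 0.14.1+cu116"; do
    # $pin is deliberately unquoted so it splits into name and version
    if printf '%s\n' "$freeze" | has_pin $pin; then
      echo "ok: $pin"
    else
      echo "MISSING: $pin"
      status=1
    fi
  done
  return "$status"
}
```

Inside the activated venv this would be run as pip freeze | check_pins.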
The script itself
The script needs no switches, only the audio file to be transcribed. As the usage comment shows, it also accepts an optional model name (default: small) and an optional language code.
#!/usr/bin/env bash
set -euo pipefail
# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists
if [[ $# -lt 1 ]]; then
echo "Usage: whisper <input.extension> [model] [language]"
exit 1
fi
INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then
echo "Error: Input file not found: $INPUT"
exit 1
fi
BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"
# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
echo "Error: Output file already exists:"
echo " $OUTPUT"
echo "Aborting to avoid overwrite."
exit 1
fi
# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
echo "Error: whisper not found in PATH or venv."
exit 1
fi
TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT
echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"
ARGS=(
"$INPUT"
--model "$MODEL"
--output_dir "$TMPDIR"
--output_format txt
--task transcribe
--verbose False
--fp16 False
)
if [[ -n "$LANGUAGE" ]]; then
ARGS+=( --language "$LANGUAGE" )
fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
if [[ -z "${FOUND_TXT:-}" ]]; then
echo "Error: No .txt output produced."
exit 1
fi
GENERATED_TXT="$FOUND_TXT"
fi
# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:"
echo " $OUTPUT"
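Since the runner refuses to overwrite and exits with an error when the .txt already exists, looping it over a whole folder needs a skip check first. Here is a minimal sketch of that, assuming the runner is reachable as whisper on PATH (as in its usage text). transcribe_missing and the .m4a-only glob are hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the original scripts): run a transcribe
# command on every .m4a in a directory that has no matching .txt next to it.
transcribe_missing() {
  local dir="$1"
  local cmd="${2:-whisper}"   # assumption: the runner is on PATH as "whisper"
  local f
  for f in "$dir"/*.m4a; do
    # If the glob matched nothing, $f is the literal pattern; skip it
    [ -e "$f" ] || continue
    # The runner writes <stem>.txt next to the input, so check for that
    if [ -f "${f%.*}.txt" ]; then
      echo "skip: $f"
      continue
    fi
    "$cmd" "$f"
  done
}
```

Passing the command as a second argument keeps the loop testable with a stub, for example transcribe_missing ~/audio echo to preview which files would be processed.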
30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
b0fbb9c4287dd26aa452f1adc93e224e681051e1
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an .m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
whisper.reg (explorer right clicks)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
To make sure old packages are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, run it with -u to uninstall torch, torchvision, torchaudio and numpy from the venv, then run it again without the switch to reinstall without conflicts.
#!/usr/bin/env bash
set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"
# --- Parse args ---
while getopts ":u" opt; do
case "$opt" in
u) MODE="uninstall" ;;
*)
echo "Usage: $0 [-u]"
exit 1
;;
esac
done
echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"
# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
echo "Error: venv not found: $VENV_DIR"
exit 1
fi
# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
echo "==> Uninstalling incompatible packages ONLY (-u)"
pip uninstall -y torch torchvision torchaudio || true
pip uninstall -y numpy || true
echo ""
echo "Done."
echo "Uninstall completed. Nothing else touched."
exit 0
fi
# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
numpy==1.26.4 \
torch==1.13.1+cu116 \
torchvision==0.14.1+cu116 \
torchaudio==0.13.1 \
--extra-index-url https://download.pytorch.org/whl/cu116
# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF
echo ""
echo "Done."
echo "Install completed without destructive actions."
The script itself
The script needs no switches, only the audio file to be transcribed. As the usage comment shows, it also accepts an optional model name (default: small) and an optional language code.
#!/usr/bin/env bash
set -euo pipefail
# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists
if [[ $# -lt 1 ]]; then
echo "Usage: whisper <input.extension> [model] [language]"
exit 1
fi
INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then
echo "Error: Input file not found: $INPUT"
exit 1
fi
BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"
# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
echo "Error: Output file already exists:"
echo " $OUTPUT"
echo "Aborting to avoid overwrite."
exit 1
fi
# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
echo "Error: whisper not found in PATH or venv."
exit 1
fi
TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT
echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"
ARGS=(
"$INPUT"
--model "$MODEL"
--output_dir "$TMPDIR"
--output_format txt
--task transcribe
--verbose False
--fp16 False
)
if [[ -n "$LANGUAGE" ]]; then
ARGS+=( --language "$LANGUAGE" )
fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
if [[ -z "${FOUND_TXT:-}" ]]; then
echo "Error: No .txt output produced."
exit 1
fi
GENERATED_TXT="$FOUND_TXT"
fi
# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:"
echo " $OUTPUT"
7048054bb2d73799a6f2563ca0267e8a302b4ff0
16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
b0fbb9c4287dd26aa452f1adc93e224e681051e1
30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text transcribed so it can be pasted into Suno, but it only exists as an .m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
The script itself
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
whisper.reg (explorer right clicks)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
To make sure old packages are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, run it with -u to uninstall torch, torchvision, torchaudio and numpy from the venv, then run it again without the switch to reinstall without conflicts.
#!/usr/bin/env bash
set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"
while getopts ":u" opt; do
  case "$opt" in
    u) MODE="uninstall" ;;
    *)
      echo "Usage: $0 [-u]"
      exit 1
      ;;
  esac
done
echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"
if [[ ! -d "$VENV_DIR" ]]; then
  echo "Error: venv not found: $VENV_DIR"
  exit 1
fi
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
if [[ "$MODE" == "uninstall" ]]; then
  echo "==> Uninstalling incompatible packages ONLY (-u)"
  pip uninstall -y torch torchvision torchaudio || true
  pip uninstall -y numpy || true
  echo ""
  echo "Done."
  echo "Uninstall completed. Nothing else touched."
  exit 0
fi
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
  numpy==1.26.4 \
  torch==1.13.1+cu116 \
  torchvision==0.14.1+cu116 \
  torchaudio==0.13.1 \
  --extra-index-url https://download.pytorch.org/whl/cu116
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF
echo ""
echo "Done."
echo "Install completed without destructive actions."
The script itself
The script needs no switches, only the audio file to be transcribed. As the usage text shows, it also accepts an optional model name (default: small) and an optional language code.
#!/usr/bin/env bash
set -euo pipefail
if [[ $# -lt 1 ]]; then
  echo "Usage: whisper <input.extension> [model] [language]"
  exit 1
fi
INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then
  echo "Error: Input file not found: $INPUT"
  exit 1
fi
BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"
if [[ -f "$OUTPUT" ]]; then
  echo "Error: Output file already exists:"
  echo " $OUTPUT"
  echo "Aborting to avoid overwrite."
  exit 1
fi
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
  WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
  echo "Error: whisper not found in PATH or venv."
  exit 1
fi
TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT
echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"
ARGS=(
  "$INPUT"
  --model "$MODEL"
  --output_dir "$TMPDIR"
  --output_format txt
  --task transcribe
  --verbose False
  --fp16 False
)
if [[ -n "$LANGUAGE" ]]; then
  ARGS+=( --language "$LANGUAGE" )
fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
  FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
  if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
  fi
  GENERATED_TXT="$FOUND_TXT"
fi
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:"
echo " $OUTPUT"
16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
7048054bb2d73799a6f2563ca0267e8a302b4ff0
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
DESCRIPTION:
I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need text to paste into Suno that only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually). I first found a Samsung app that could handle […]
CONTENT:
I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need text to paste into Suno that only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
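To make the path conversion step concrete, here is a minimal sketch of what `wslpath` does for a plain drive-letter path. The function `win_to_wsl_path` is a hypothetical helper written for illustration only, and it assumes the default WSL automount root `/mnt`. The real `wslpath` handles more cases (UNC paths, custom mount roots), so the batch file should keep using it.

```shell
#!/usr/bin/env bash
# Illustrative approximation of `wslpath` for simple "X:\..." paths.
# win_to_wsl_path is a hypothetical helper, not part of WSL.
# Assumption: drives are mounted under /mnt (the WSL default).
win_to_wsl_path() {
    local win="$1"
    local re='^([A-Za-z]):[\\/](.*)$'
    local drive rest
    # Accept only a drive letter followed by a backslash or slash
    if [[ ! "$win" =~ $re ]]; then
        echo "unsupported path: $win" >&2
        return 1
    fi
    drive=${BASH_REMATCH[1],,}      # lowercase the drive letter (bash 4+)
    rest=${BASH_REMATCH[2]//\\//}   # turn backslashes into forward slashes
    printf '/mnt/%s/%s\n' "$drive" "$rest"
}

# Spaces survive because the value is always quoted
win_to_wsl_path 'F:\viktigt\Private\song one.m4a'
```

The converted path is a single string, which is why the batch file wraps `%WSL_FILE%` in escaped quotes before handing it to `bash -lc`.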
whisper.reg (Explorer right-click)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
Installer for WSL/Linux (with 1060 compatibility and pre-uninstaller)
To make sure the conflicting packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u and then run it again to reinstall without conflicts.
#!/usr/bin/env bash
set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"
# --- Parse args ---
while getopts ":u" opt; do
    case "$opt" in
        u) MODE="uninstall" ;;
        *)
            echo "Usage: $0 [-u]"
            exit 1
            ;;
    esac
done
echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"
# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
fi
# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"
    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true
    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
fi
# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116
# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF
echo ""
echo "Done."
echo "Install completed without destructive actions."
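The installer stops with "venv not found" unless the virtual environment already exists, so it may help to see how that prerequisite could be prepared. The sketch below is an assumption about the setup, not part of the original scripts: `ensure_venv` is a hypothetical helper, and installing the `openai-whisper` package into the venv is assumed to be how `$HOME/.venvs/whisper/bin/whisper` gets there.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the venv the installer expects, if it is missing.
# Assumption: python3 with the venv module is available inside WSL.
ensure_venv() {
    local venv_dir="${1:-$HOME/.venvs/whisper}"
    if [[ -f "$venv_dir/bin/activate" ]]; then
        echo "venv already present: $venv_dir"
        return 0
    fi
    echo "creating venv: $venv_dir"
    python3 -m venv "$venv_dir"
}

# Possible order of operations (commented out so nothing runs by accident):
# ensure_venv "$HOME/.venvs/whisper"
# "$HOME/.venvs/whisper/bin/pip" install openai-whisper   # assumption: Whisper itself comes from this package
# ./install.sh -u    # "install.sh" is a placeholder name for the installer above
# ./install.sh       # then pin the GTX 1060 compatible stack
```

Running the uninstall pass before the pinned install mirrors the purpose of the -u switch: clear the conflicting torch and numpy builds first, then install the known-good versions.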
The script itself
The script needs no switches, only the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.
#!/usr/bin/env bash
set -euo pipefail
# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists
if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
fi
INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
fi
BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"
# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
fi
# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
fi
TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT
echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"
ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
)
if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
        echo "Error: No .txt output produced."
        exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
fi
# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:"
echo " $OUTPUT"
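The output name is derived from the input name with `basename`, `${BASENAME%.*}` and `dirname`. The sketch below isolates that logic in a hypothetical function, `transcript_path`, so the behaviour with spaces, several dots, and bare file names is easy to check. It is an illustration of the same expansions, not an addition to the runner.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring how the runner builds its output path.
transcript_path() {
    local input="$1"
    local base stem dir
    base="$(basename "$input")"   # file name without directories
    stem="${base%.*}"             # strip only the last ".extension"
    dir="$(dirname "$input")"     # "." when no directory is given
    printf '%s/%s.txt\n' "$dir" "$stem"
}

transcript_path "/music/my song.v2.m4a"
transcript_path "song.m4a"
```

Only the last extension is stripped, so `my song.v2.m4a` becomes `my song.v2.txt` next to the audio file. Typical invocations of the runner itself would look like `whisper song.m4a` or `whisper song.m4a medium sv`, where the model and language values are examples.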
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
DESCRIPTION:
I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need text to paste into Suno that only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that...
CONTENT:
I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need text to paste into Suno that only exists as an m4a file (i.e. music, which sometimes has hardcoded subtitles that have to be transcribed manually).
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
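To make the path-conversion step more concrete: inside WSL, wslpath turns a Windows path into its /mnt/<drive>/... form. The sketch below is a rough pure-bash approximation of that step. The function name win_to_wsl is hypothetical and not part of WSL. It assumes the default /mnt automount root and only handles plain drive-letter paths; the real wslpath covers more cases.

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximates what wslpath does for drive-letter paths.
# Assumption: drives are mounted under /mnt (the WSL default). UNC paths and
# relative paths are rejected rather than guessed at.
win_to_wsl() {
  local win="$1"
  local re='^[A-Za-z]:\\'          # e.g. F:\ at the start of the path
  if [[ ! "$win" =~ $re ]]; then
    echo "Unsupported path: $win" >&2
    return 1
  fi
  local drive="${win:0:1}"
  local rest="${win:2}"            # drop "F:", keep "\viktigt\..."
  rest="${rest//\\//}"             # backslashes -> forward slashes
  printf '/mnt/%s%s\n' "${drive,,}" "$rest"
}

win_to_wsl 'F:\viktigt\Private\my song.m4a'
# prints: /mnt/f/viktigt/Private/my song.m4a
```

Spaces and non-ASCII characters pass through unchanged, which is why the batch file only needs correct quoting and the UTF-8 codepage around the real wslpath call.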
whisper.reg (Explorer right-click)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
To make sure conflicting packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u (it uninstalls torch, torchvision, torchaudio, and numpy from the venv), then run it again without switches to reinstall without conflicts.
#!/usr/bin/env bash
set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"
# --- Parse args ---
while getopts ":u" opt; do
case "$opt" in
u) MODE="uninstall" ;;
*)
echo "Usage: $0 [-u]"
exit 1
;;
esac
done
echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"
# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
echo "Error: venv not found: $VENV_DIR"
exit 1
fi
# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
echo "==> Uninstalling incompatible packages ONLY (-u)"
pip uninstall -y torch torchvision torchaudio || true
pip uninstall -y numpy || true
echo ""
echo "Done."
echo "Uninstall completed. Nothing else touched."
exit 0
fi
# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
numpy==1.26.4 \
torch==1.13.1+cu116 \
torchvision==0.14.1+cu116 \
torchaudio==0.13.1 \
--extra-index-url https://download.pytorch.org/whl/cu116
# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
print("GPU:", torch.cuda.get_device_name(0))
print("Capability:", torch.cuda.get_device_capability(0))
EOF
echo ""
echo "Done."
echo "Install completed without destructive actions."
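The -u handling can be tried in isolation. The sketch below wraps the same getopts loop in a function so it can be called repeatedly. The name parse_mode is hypothetical and exists only for this illustration; the installer itself runs the loop inline.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the installer's switch parsing.
# Prints "install" by default and "uninstall" when -u is given.
parse_mode() {
  local mode="install" opt
  local OPTIND=1                   # reset so repeated calls start fresh
  while getopts ":u" opt; do
    case "$opt" in
      u) mode="uninstall" ;;
      *)
        echo "Usage: installer [-u]" >&2
        return 1
        ;;
    esac
  done
  echo "$mode"
}

parse_mode        # prints: install
parse_mode -u     # prints: uninstall
```

The leading colon in ":u" puts getopts in silent mode, so an unknown switch falls into the catch-all branch and prints the usage line instead of a built-in error message.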
The script itself
The script needs only one argument: the audio file to transcribe. No switches are required, but as the usage comment shows, it also accepts an optional model name (default: small) and an optional language (auto-detected when omitted).
#!/usr/bin/env bash
set -euo pipefail
# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists
if [[ $# -lt 1 ]]; then
echo "Usage: whisper <input.extension> [model] [language]"
exit 1
fi
INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then
echo "Error: Input file not found: $INPUT"
exit 1
fi
BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"
# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
echo "Error: Output file already exists:"
echo " $OUTPUT"
echo "Aborting to avoid overwrite."
exit 1
fi
# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
echo "Error: whisper not found in PATH or venv."
exit 1
fi
TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT
echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"
ARGS=(
"$INPUT"
--model "$MODEL"
--output_dir "$TMPDIR"
--output_format txt
--task transcribe
--verbose False
--fp16 False
)
if [[ -n "$LANGUAGE" ]]; then
ARGS+=( --language "$LANGUAGE" )
fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
if [[ -z "${FOUND_TXT:-}" ]]; then
echo "Error: No .txt output produced."
exit 1
fi
GENERATED_TXT="$FOUND_TXT"
fi
# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:"
echo " $OUTPUT"
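How the transcript name is derived from the input can be checked without running Whisper at all. The sketch below copies the runner's basename/dirname logic into a function. The name transcript_path is hypothetical and used only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the runner's output-path logic:
# same directory as the input, last extension replaced by .txt.
transcript_path() {
  local input="$1"
  local base stem dir
  base="$(basename -- "$input")"
  stem="${base%.*}"                # strips only the last ".ext"
  dir="$(dirname -- "$input")"
  printf '%s/%s.txt\n' "$dir" "$stem"
}

transcript_path "/mnt/f/music/my song.final.m4a"
# prints: /mnt/f/music/my song.final.txt
transcript_path "song.m4a"
# prints: ./song.txt
```

Because only the last extension is stripped, two inputs that differ only in extension (for example song.m4a and song.mp3 in the same folder) map to the same transcript. In that case the runner's overwrite check aborts the second run.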
7048054bb2d73799a6f2563ca0267e8a302b4ff0
16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
b0fbb9c4287dd26aa452f1adc93e224e681051e1
30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
7048054bb2d73799a6f2563ca0267e8a302b4ff0
b0fbb9c4287dd26aa452f1adc93e224e681051e1
16a1cc3a9de52040624c9a9a5d778dc05d7aaf6d
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
CONTENT:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. One example: I need a text to paste into Suno, but it only exists inside an .m4a file (that is, music, sometimes with hardcoded subtitles that have to be transcribed manually).
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
whisper.reg (Explorer right-click)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
To make sure conflicting packages are removed properly before reinstalling, the script has a -u switch. If the first install goes wrong, run the script with -u (it uninstalls torch, torchvision, torchaudio, and numpy from the venv), then run it again without switches to reinstall without conflicts.
#!/usr/bin/env bash
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"

# --- Parse args ---
while getopts ":u" opt; do
  case "$opt" in
    u) MODE="uninstall" ;;
    *)
      echo "Usage: $0 [-u]"
      exit 1
      ;;
  esac
done

echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"

# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
  echo "Error: venv not found: $VENV_DIR"
  exit 1
fi

# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel

# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
  echo "==> Uninstalling incompatible packages ONLY (-u)"
  pip uninstall -y torch torchvision torchaudio || true
  pip uninstall -y numpy || true
  echo ""
  echo "Done."
  echo "Uninstall completed. Nothing else touched."
  exit 0
fi

# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
  numpy==1.26.4 \
  torch==1.13.1+cu116 \
  torchvision==0.14.1+cu116 \
  torchaudio==0.13.1 \
  --extra-index-url https://download.pytorch.org/whl/cu116

# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF

echo ""
echo "Done."
echo "Install completed without destructive actions."
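The -u handling in the installer is a plain getopts loop. As a small sketch, the same pattern can be wrapped in a function to see how it behaves with different arguments. The name parse_mode is hypothetical and only used for illustration; the installer itself parses its arguments inline.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the same getopts pattern the installer uses.
# Prints "install" by default, "uninstall" when -u is given.
parse_mode() {
  local mode="install" opt
  OPTIND=1  # reset so the function can be called more than once
  # The leading colon puts getopts in silent mode: unknown options
  # arrive as "?" and are caught by the catch-all branch below.
  while getopts ":u" opt; do
    case "$opt" in
      u) mode="uninstall" ;;
      *)
        echo "Usage: installer [-u]" >&2
        return 1
        ;;
    esac
  done
  printf '%s\n' "$mode"
}
```

Running parse_mode with no arguments prints install, parse_mode -u prints uninstall, and any other switch returns a non-zero status with the usage line on stderr.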
The script itself
The script runs without any switches; the only required argument is the audio file to transcribe. As the usage comment shows, it also accepts an optional model and language.
#!/usr/bin/env bash
set -euo pipefail

# whisper-run.sh
# Usage:
#   whisper <input.extension> [model] [language]
#
# Output:
#   <input-filename>.txt (same directory)
#
# Behaviour:
#   - Refuses to overwrite existing .txt
#   - Stops execution if output exists

if [[ $# -lt 1 ]]; then
  echo "Usage: whisper <input.extension> [model] [language]"
  exit 1
fi

INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"

if [[ ! -f "$INPUT" ]]; then
  echo "Error: Input file not found: $INPUT"
  exit 1
fi

BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"

# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
  echo "Error: Output file already exists:"
  echo " $OUTPUT"
  echo "Aborting to avoid overwrite."
  exit 1
fi

# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
  WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi

if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
  echo "Error: whisper not found in PATH or venv."
  exit 1
fi

TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT

echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"

ARGS=(
  "$INPUT"
  --model "$MODEL"
  --output_dir "$TMPDIR"
  --output_format txt
  --task transcribe
  --verbose False
  --fp16 False
)

if [[ -n "$LANGUAGE" ]]; then
  ARGS+=( --language "$LANGUAGE" )
fi

"$WHISPER_BIN" "${ARGS[@]}"

GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
  FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
  if [[ -z "${FOUND_TXT:-}" ]]; then
    echo "Error: No .txt output produced."
    exit 1
  fi
  GENERATED_TXT="$FOUND_TXT"
fi

# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"

echo "==> Done:"
echo " $OUTPUT"
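Because the runner refuses to overwrite an existing transcript, it is fairly safe to loop it over a whole folder. The following is a minimal sketch of that idea, not part of the original setup. The helper names transcript_path, needs_transcript and transcribe_dir are hypothetical, the .m4a pattern is just an example, and the loop assumes the runner is reachable as whisper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mirrors the runner's naming rule (<dir>/<stem>.txt).
transcript_path() {
  local input="$1"
  local base stem dir
  base="$(basename "$input")"
  stem="${base%.*}"            # strip only the last extension
  dir="$(dirname "$input")"
  printf '%s/%s.txt\n' "$dir" "$stem"
}

# Returns 0 when no transcript exists yet for the given audio file.
needs_transcript() {
  [ ! -f "$(transcript_path "$1")" ]
}

# Hypothetical batch loop: run the runner on every .m4a in a directory,
# skipping files that already have a transcript next to them.
transcribe_dir() {
  local dir="$1" f
  for f in "$dir"/*.m4a; do
    [ -f "$f" ] || continue    # no matches: the glob stays literal
    if needs_transcript "$f"; then
      whisper "$f"
    else
      echo "skip: $f"
    fi
  done
}
```

Checking for the transcript before calling the runner keeps one already-transcribed file from aborting the rest of the batch, since the runner exits with an error when the .txt already exists.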
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text […]
CONTENT:
I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need a text to paste into Suno, but it only exists as an m4a file (i.e. music, sometimes with hardcoded subtitles that have to be transcribed manually).
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
The script itself
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
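To make the path-conversion step in whisper.bat more concrete, here is a minimal sketch of roughly what wslpath does for a plain drive-letter path. The function name win_to_wsl_path is hypothetical, and the sketch assumes that Windows drives are mounted under /mnt (the usual WSL default). The real wslpath also handles UNC paths and custom mount points, which this sketch does not.

```shell
#!/usr/bin/env bash
# Illustrative approximation of `wslpath` for paths like F:\dir\file.m4a.
# Assumption: drives are mounted under /mnt/<lowercase drive letter>.
win_to_wsl_path() {
  local win="$1"
  local drive rest
  # Everything before the first colon is the drive letter; lowercase it.
  drive="$(printf '%s' "${win%%:*}" | tr '[:upper:]' '[:lower:]')"
  # Drop the leading "X:" and turn backslashes into forward slashes.
  rest="$(printf '%s' "${win#?:}" | tr '\\' '/')"
  printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Demo: prints /mnt/f/viktigt/Private/Linux-Scripts/Whisper.bat
win_to_wsl_path 'F:\viktigt\Private\Linux-Scripts\Whisper.bat'
```

In the batch file itself, calling the real wslpath inside WSL remains the safer choice; the sketch only shows why the result can be handed straight to a Linux script.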
whisper.reg (explorer right clicks)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transkribera med Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
Installer for WSL/Linux (with GTX 1060 compatibility and pre-uninstaller)
To make sure the conflicting packages are removed properly before reinstalling, the script has a -u switch. If the first installation goes wrong, run the script with -u to uninstall torch, torchvision, torchaudio, and numpy, then run it again without the switch to reinstall without conflicts.
#!/usr/bin/env bash
set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"
# --- Parse args ---
while getopts ":u" opt; do
case "$opt" in
u) MODE="uninstall" ;;
*)
echo "Usage: $0 [-u]"
exit 1
;;
esac
done
echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"
# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
echo "Error: venv not found: $VENV_DIR"
exit 1
fi
# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
echo "==> Uninstalling incompatible packages ONLY (-u)"
pip uninstall -y torch torchvision torchaudio || true
pip uninstall -y numpy || true
echo ""
echo "Done."
echo "Uninstall completed. Nothing else touched."
exit 0
fi
# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
numpy==1.26.4 \
torch==1.13.1+cu116 \
torchvision==0.14.1+cu116 \
torchaudio==0.13.1 \
--extra-index-url https://download.pytorch.org/whl/cu116
# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF
echo ""
echo "Done."
echo "Install completed without destructive actions."
The script itself
The script needs no switches, only the audio file to transcribe. As the usage line shows, it also accepts an optional model name (default: small) and an optional language code (default: auto-detect).
#!/usr/bin/env bash
set -euo pipefail
# whisper-run.sh
# Usage:
# whisper <input.extension> [model] [language]
#
# Output:
# <input-filename>.txt (same directory)
#
# Behaviour:
# - Refuses to overwrite existing .txt
# - Stops execution if output exists
if [[ $# -lt 1 ]]; then
echo "Usage: whisper <input.extension> [model] [language]"
exit 1
fi
INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then
echo "Error: Input file not found: $INPUT"
exit 1
fi
BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"
# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
echo "Error: Output file already exists:"
echo " $OUTPUT"
echo "Aborting to avoid overwrite."
exit 1
fi
# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
echo "Error: whisper not found in PATH or venv."
exit 1
fi
TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT
echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"
ARGS=(
"$INPUT"
--model "$MODEL"
--output_dir "$TMPDIR"
--output_format txt
--task transcribe
--verbose False
--fp16 False
)
if [[ -n "$LANGUAGE" ]]; then
ARGS+=( --language "$LANGUAGE" )
fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
if [[ -z "${FOUND_TXT:-}" ]]; then
echo "Error: No .txt output produced."
exit 1
fi
GENERATED_TXT="$FOUND_TXT"
fi
# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:"
echo " $OUTPUT"
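How the runner chooses its output file can be sketched in isolation. The helper name transcript_path is an assumption made for illustration; it mirrors the basename, dirname, and ${BASENAME%.*} steps from the script above and simply prints the path where the transcript would end up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the transcript path the same way the runner does.
transcript_path() {
  local input="$1"
  local base stem dir
  base="$(basename "$input")"   # file name without its directory
  stem="${base%.*}"             # strip only the last extension
  dir="$(dirname "$input")"     # directory of the input file
  printf '%s/%s.txt\n' "$dir" "$stem"
}

# Demo: prints /music/my song.v2.txt
transcript_path "/music/my song.v2.m4a"
```

Only the final extension is stripped, so a file name with several dots keeps everything before the last one, and the transcript always lands next to the audio file. Assuming the runner is installed as whisper on your PATH, typical calls would look like whisper song.m4a or whisper song.m4a medium sv.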
30b1980e02b98f24cf08ff2a3b59ce922f5c1d2d
b0fbb9c4287dd26aa452f1adc93e224e681051e1
TITLE:
The Struggle: Transcribe stuff for free with Whisper and WSL/Linux – With a GTX 1060
DESCRIPTION:
I’ve been struggling with transcription issues for quite some time, for a variety of reasons. Examples: I need a text transcribed to be pasted into Suno, that only exists as a m4a-file (i.e. music, that sometimes has hardcoded subtitles that...
CONTENT:
I’ve been struggling with transcription for quite some time, for a variety of reasons. One example: I need a text to paste into Suno, but it only exists as an m4a file (i.e. music, sometimes with hardcoded subtitles that have to be transcribed manually).
I first found a Samsung app that could handle transcription, but it quickly became clear that it was limited to its own ecosystem. In practice, you could only transcribe audio that had been recorded inside that specific app.
Since then, I’ve been looking around on and off, and more recently I picked it up again as the need increased – partly to get correct transcriptions, but also to be able to process any audio files I download or record. Samsung’s app is decent, but the quality varies. Right after recording, it performs a quick transcription, but the result is noticeably worse than if you re-run the transcription once the audio file is fully finalized.
At that point I came across “Whisper Transcribe” for Windows. It works, but it requires an account and, of course, paid credits to continue transcribing. You get a small number of free credits at first, but once those run out, you’re expected to pay quite a bit just to keep going.
I already knew that there must be software capable of doing this completely locally. I had previously discovered that Whisper exists in an open-source form as well (I’m not even sure whether the Windows application actually builds on that or not). So today I decided to finally figure out how to do it properly myself.
The end result was the following (thanks to ChatGPT):
A Whisper installer for WSL/Linux, with explicit support for NVIDIA GTX 1060 – something newer Python libraries clearly no longer handle well.
A Whisper runner for WSL/Linux: run whisper <input-file> and get a .txt transcript generated from the audio file.
A Windows Registry file that allows transcription to be executed directly from Windows Explorer via right-click.
A batch file that bridges Windows and WSL so everything runs cleanly, including proper handling of spaces and non-ASCII characters in file names.
The result is a fully local, offline transcription setup that works on any audio file, without accounts, credits, or vendor lock-in.
WSL uses python and pip…
Table of Contents
whisper.bat
whisper.reg (explorer right clicks)
Installer for WSL/Linux (with 1060-compatibility and pre-uninstaller)
The script itself
whisper.bat
@echo off
setlocal EnableExtensions
REM Force UTF-8 codepage (fixes å ä ö)
chcp 65001 >nul
REM File passed from Explorer
set "WIN_FILE=%~1"
REM Convert Windows path to WSL path (UTF-8 safe now)
for /f "delims=" %%i in ('wsl wslpath "%WIN_FILE%"') do set "WSL_FILE=%%i"
REM Run whisper on that file
wsl bash -lc "/usr/local/tornevall/whisper \"%WSL_FILE%\""
endlocal
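The `wslpath` call is what turns the path Explorer hands over into something bash can open. For an ordinary drive-letter path, and assuming WSL's default `/mnt/<drive>` mount layout, the mapping looks roughly like the function below. The name `win_to_wsl_path` is made up for illustration; it is a sketch of the idea, not a replacement for `wslpath`, which knows the actual mount configuration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: approximates wslpath for plain drive-letter paths only.
# Assumes the default WSL automount root (/mnt).
win_to_wsl_path() {
    local win="$1"
    local drive="${win:0:1}"      # e.g. "F"
    local rest="${win:2}"         # everything after "F:"
    rest="${rest//\\//}"          # turn every backslash into a forward slash
    printf '/mnt/%s%s\n' "${drive,,}" "$rest"   # lowercase drive letter (bash 4+)
}

win_to_wsl_path 'F:\viktigt\Private\My Song.m4a'
```

Spaces and characters such as å, ä and ö pass through unchanged here, which is why the batch file only has to worry about the codepage and the quoting around `%WSL_FILE%`.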
whisper.reg (explorer right clicks)
Windows Registry Editor Version 5.00
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL]
@="Transcribe with Whisper (WSL)"
"Icon"="wsl.exe"
[HKEY_CLASSES_ROOT\*\shell\WhisperWSL\command]
@="\"F:\\viktigt\\Private\\Linux-Scripts\\Whisper.bat\" \"%1\""
Installer for WSL/Linux (with 1060-compatibility and pre-uninstaller)
To make sure the old packages are removed properly before reinstalling, the script has a -u switch. If something goes wrong the first time, run the script with -u and then run it again without the switch to reinstall without conflicts.
#!/usr/bin/env bash
set -euo pipefail
VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
MODE="install"
# --- Parse args ---
while getopts ":u" opt; do
    case "$opt" in
        u) MODE="uninstall" ;;
        *)
            echo "Usage: $0 [-u]"
            exit 1
            ;;
    esac
done
echo "==> Whisper installer (GTX 1060 compatible)"
echo "==> Mode: $MODE"
# --- Sanity ---
if [[ ! -d "$VENV_DIR" ]]; then
    echo "Error: venv not found: $VENV_DIR"
    exit 1
fi
# shellcheck disable=SC1090
source "$VENV_DIR/bin/activate"
python -m pip install --upgrade pip setuptools wheel
# ==================================================
# UNINSTALL MODE (-u)
# ==================================================
if [[ "$MODE" == "uninstall" ]]; then
    echo "==> Uninstalling incompatible packages ONLY (-u)"
    pip uninstall -y torch torchvision torchaudio || true
    pip uninstall -y numpy || true
    echo ""
    echo "Done."
    echo "Uninstall completed. Nothing else touched."
    exit 0
fi
# ==================================================
# INSTALL MODE (DEFAULT)
# ==================================================
echo "==> Installing compatible stack (no forced uninstall)"
pip install \
    numpy==1.26.4 \
    torch==1.13.1+cu116 \
    torchvision==0.14.1+cu116 \
    torchaudio==0.13.1 \
    --extra-index-url https://download.pytorch.org/whl/cu116
# --- Verify ---
echo "==> Verifying environment"
python - << 'EOF'
import torch, numpy
print("Torch:", torch.__version__)
print("NumPy:", numpy.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Capability:", torch.cuda.get_device_capability(0))
EOF
echo ""
echo "Done."
echo "Install completed without destructive actions."
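The installer stops with "venv not found" when the virtual environment is missing, and it only pins NumPy and the PyTorch stack. A first-time setup therefore probably needs a step before it and a step after it. The sketch below rests on two assumptions that are not in the original text: that the installer is saved as `./whisper-install.sh` (a hypothetical file name), and that Whisper itself comes from the `openai-whisper` package on PyPI.

```shell
#!/usr/bin/env bash
# Sketch only. INSTALLER is a hypothetical file name for the installer script,
# and the openai-whisper step is an assumption, not something the installer does.
set -euo pipefail

VENV_DIR="${VENV_DIR:-$HOME/.venvs/whisper}"
INSTALLER="${INSTALLER:-./whisper-install.sh}"

# Create the virtual environment the installer expects, if it is missing.
create_venv_if_missing() {
    if [[ ! -d "$VENV_DIR" ]]; then
        python3 -m venv "$VENV_DIR"
    fi
}

# Remove the torch/numpy packages with -u, then install the pinned stack again.
reinstall_stack() {
    "$INSTALLER" -u
    "$INSTALLER"
}

# Assumed step: install Whisper itself into the same venv.
install_whisper_cli() {
    "$VENV_DIR/bin/pip" install openai-whisper
}

# Nothing runs automatically; call the functions in this order when ready:
#   create_venv_if_missing && reinstall_stack && install_whisper_cli
```

If pip changes any of the pinned versions while resolving Whisper's dependencies, running the installer once more and reading its verification output should show whether the GPU is still detected.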
The script itself
The script needs no switches, only the audio file to be transcribed (but as the usage comment shows, it also accepts an optional model and language).
#!/usr/bin/env bash
set -euo pipefail
# whisper-run.sh
# Usage:
#   whisper <input.extension> [model] [language]
#
# Output:
#   <input-filename>.txt (same directory)
#
# Behaviour:
#   - Refuses to overwrite existing .txt
#   - Stops execution if output exists
if [[ $# -lt 1 ]]; then
    echo "Usage: whisper <input.extension> [model] [language]"
    exit 1
fi
INPUT="$1"
MODEL="${2:-small}"
LANGUAGE="${3:-}"
if [[ ! -f "$INPUT" ]]; then
    echo "Error: Input file not found: $INPUT"
    exit 1
fi
BASENAME="$(basename "$INPUT")"
STEM="${BASENAME%.*}"
OUTDIR="$(dirname "$INPUT")"
OUTPUT="$OUTDIR/$STEM.txt"
# --- Refuse overwrite ---
if [[ -f "$OUTPUT" ]]; then
    echo "Error: Output file already exists:"
    echo " $OUTPUT"
    echo "Aborting to avoid overwrite."
    exit 1
fi
# Prefer venv whisper if installed via install script
WHISPER_VENV="${WHISPER_VENV:-$HOME/.venvs/whisper}"
WHISPER_BIN="whisper"
if [[ -x "$WHISPER_VENV/bin/whisper" ]]; then
    WHISPER_BIN="$WHISPER_VENV/bin/whisper"
fi
if [[ "$WHISPER_BIN" == "whisper" ]] && ! command -v whisper >/dev/null 2>&1; then
    echo "Error: whisper not found in PATH or venv."
    exit 1
fi
TMPDIR="$(mktemp -d)"
cleanup() { rm -rf "$TMPDIR"; }
trap cleanup EXIT
echo "==> Transcribing:"
echo " input: $INPUT"
echo " output: $OUTPUT"
echo " model: $MODEL"
echo " lang: ${LANGUAGE:-auto}"
ARGS=(
    "$INPUT"
    --model "$MODEL"
    --output_dir "$TMPDIR"
    --output_format txt
    --task transcribe
    --verbose False
    --fp16 False
)
if [[ -n "$LANGUAGE" ]]; then
    ARGS+=( --language "$LANGUAGE" )
fi
"$WHISPER_BIN" "${ARGS[@]}"
GENERATED_TXT="$TMPDIR/$STEM.txt"
if [[ ! -f "$GENERATED_TXT" ]]; then
    FOUND_TXT="$(find "$TMPDIR" -maxdepth 1 -type f -name "*.txt" | head -n 1 || true)"
    if [[ -z "${FOUND_TXT:-}" ]]; then
        echo "Error: No .txt output produced."
        exit 1
    fi
    GENERATED_TXT="$FOUND_TXT"
fi
# --- Final move (no overwrite possible due to earlier check) ---
mv "$GENERATED_TXT" "$OUTPUT"
echo "==> Done:"
echo " $OUTPUT"
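The output path in the runner comes from two small steps: strip the last extension from the file name, then put `.txt` back on in the same directory. As a stand-alone sketch of just that part (the function name `transcript_path_for` is made up for illustration and does not exist in the script):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: mirrors how the runner derives <input-filename>.txt
transcript_path_for() {
    local input="$1"
    local base stem outdir
    base="$(basename "$input")"     # file name without directory
    stem="${base%.*}"               # drop only the last extension
    outdir="$(dirname "$input")"    # "." when no directory was given
    printf '%s/%s.txt\n' "$outdir" "$stem"
}

transcript_path_for "/mnt/f/music/My Song.m4a"
transcript_path_for "demo.tar.gz"
```

One consequence of this scheme: an input that already ends in `.txt` maps onto itself, so the runner's overwrite check stops the run instead of clobbering the file.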
TITLE:
The main purpose of the tech house track “Magdalena”
DESCRIPTION:
The lyrics for “Magdalena” are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.
CONTENT:
This track was created as a response to fear-driven narratives surrounding social democracy, particularly how certain political ideas are framed through anxiety, exaggeration, and symbolic threat rather than concrete policy discussion.
Magdalena refers to the chair of the Swedish Social Democratic Party. In Swedish political discourse, she has become a frequent target of hostility that often extends beyond specific decisions and instead focuses on what she represents. In the context of this track, Magdalena is therefore not portrayed as an individual, but used as a symbolic figure.
The song deliberately frames her as a stabilizing presence. Repeated chants and cyclical structures emphasize continuity, direction, and resilience. Rather than engaging in debate or argumentation, the track contrasts abstract ideas of chaos and order through rhythm and repetition.
The lyrics are intentionally written in another language to mask the literal meaning and shift focus away from local political rhetoric. This choice also supports the musical direction of the track, giving it a more exotic Latin tech house character and allowing the vocal elements to function as texture and energy rather than explicit messaging.
Any harsher expressions in the lyrics are directed at abstract authoritarian or fear-based mindsets, not at individuals or groups. The intention is not provocation, but to challenge narratives built on demonization and simplified oppositions.
Overall, the track operates on a symbolic level. It responds to political fear by reframing social democracy as structure rather than disorder, stability rather than threat, and continuity rather than chaos, while remaining grounded in club-oriented electronic music aesthetics rather than overt political commentary.
TITLE:
Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About
DESCRIPTION:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example – my personal favorite – came from an […]
CONTENT:
Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate.
Spoiler: they don’t. Not even close.
The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now.
This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs work.
Table of Contents
A DAW IS NOT AI – IT WILL NEVER BE AI!
“But VSTs Use AI!”
Yes, some do – and that still doesn’t make your DAW AI!
The Real Issue: AI Musicians Who Don’t Understand Music Tools
Why This Matters
A DAW IS NOT AI – IT WILL NEVER BE AI!
A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment.
It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do.
AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine-based system that infers from its input how to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). In practice that means machine learning, which is something completely different from a sequencer of the kind that has existed since the 1980s.
But apparently, for some people on the internet, everything becomes AI if you squint hard enough.
That ALSO includes the Google screenshot that was used to “prove me wrong”, which said “Yes, DAWs use AI in a variety of ways to enhance music production”.
The text in that screenshot is not a technical definition. It is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”. That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (unless you use Thinking mode), since users tend to hate waiting for a correct answer.
It talks about using AI-powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI-generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer-reviewed research.
“But VSTs Use AI!”
Yes, some do – and that still doesn’t make your DAW AI!
This was the next brilliant argument thrown at me:
“Most of the components in a DAW use AI. Are you slow?”
First: No, they don’t.
Second: Calling people slow and other random words doesn’t magically make your argument correct.
Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.
Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real-time denoisers trained on speech like VoiceGate, or deep-learning-based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.
The Real Issue: AI Musicians Who Don’t Understand Music Tools
This whole conversation actually exposed something deeper, which is what makes this topic interesting and why I chose to highlight the idiocracy: many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing. There’s a growing crowd of prompt-pushers who:
have never mixed a track manually,
never aligned vocals without an AI tool,
never programmed automation by hand,
never learned gain staging,
never rendered or layered anything intentionally,
and absolutely never used a DAW beyond dragging stems into the timeline.
Yet they lecture others on “how audio production really works”.
And when someone challenges their nonsense, they fire off buzzwords like:
“you refuse to be educated”
“you’re a luddite”
“DAWs used AI for decades!”
“it’s the same thing!”
No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.
Why This Matters
The problem isn’t people using AI.
The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology.
AI is powerful. It’s useful – but it doesn’t replace understanding.
If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”. You’re not misunderstood.
You’re just wrong. And loudly so.
If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:
Learn what your tools are.
Learn what your tools are not.
Stop claiming everything with buttons and soundwaves is AI.
Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.
TITLE: Things “Prompt Pushers” Thought They Understood About DAWs – But Are Completely, Utterly Wrong About DESCRIPTION: Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they... CONTENT: Every now and then, especially in groups where “AI-generated music” is the hot topic, I encounter a parade of self-declared “AI creators” who seem absolutely convinced they understand how AI, DAWs, plugins, audio engines and production workflows operate. Spoiler: they don’t. Not even close. The latest example – my personal favorite – came from an exchange with a couple of AI-wannabe musicians who tried to argue that a DAW is a type of AI. Yes. A Digital Audio Workstation. According to them, pressing Record apparently counts as machine intelligence now. This is the level we’re dealing with – usually among Suno-wannabes and so-called Udio-idiots when we try to explain how DAWs works. Table of Contents Toggle A DAW IS NOT AI – IT WILL NEVER BE AI!“But VSTs Use AI!”Yes, some do – and that still doesn’t make your DAW AI!The Real Issue: AI Musicians Who Don’t Understand Music ToolsWhy This Matters A DAW IS NOT AI – IT WILL NEVER BE AI! A DAW is a workstation – software used to record, edit and produce audio, not a “thinking” system (see any basic DAW definition). A timeline. A mixer. A routing environment. It doesn’t think. It doesn’t predict. It doesn’t learn. It doesn’t hallucinate answers because you ask stupid questions. It doesn’t care what you want. It does exactly what you tell it to do. AI, meanwhile, is defined by its capacity to generate, classify, predict or reason based on trained data – a machine based system that infers from input to generate outputs like predictions, content or decisions (see the OECD and EU definitions of AI systems). 
That is machine learning, which is something completely different from a sequencer that has existed since the 1980s. But apparently, for some people on the internet, everything becomes AI if you squint hard enough.

That ALSO includes the Google screenshot that was used to “prove me wrong”, which said “Yes, DAWs use AI in a variety of ways to enhance music production”. The text in that screenshot is not a technical definition; it is an AI-generated summary from Google’s experimental AI Overviews and AI Mode in Search feature, which stitches together a generic answer to the vague prompt “does a DAW use AI in some way”... That “AI researcher” is, by the way, extremely unreliable: it works much like ChatGPT in “quick response mode”, guessing at whatever it cannot cover by itself (since users tend to hate waiting for a correct answer), unless you use Thinking mode. The summary talks about using AI-powered plugins and features inside a DAW, not about the DAW itself magically becoming an AI system. Google itself describes these AI Overviews as AI-generated “snapshots” that simply bundle key information with links to real sources, not as authoritative definitions of anything (see their own description). Treating that blurb as proof that “a DAW is AI” is like treating an ad banner as peer-reviewed research.

“But VSTs Use AI!” Yes, some do – and that still doesn’t make your DAW AI!

This was the next brilliant argument thrown at me: “Most of the components in a DAW use AI. Are you slow?”

First: No, they don’t. Second: Calling people slow, or any other random name, doesn’t magically make your argument correct.

Most plugins run on traditional DSP – decades-old mathematics based on manipulating digital samples, not on “learning” from data (audio DSP is literally just signal processing code, not AI). Compression, EQ, filtering, reverb, synthesis, modulation – none of that is AI. If you think a compressor is artificial intelligence, you need to revisit the basics.
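
To make that concrete, here is a minimal Python sketch of two classic DSP building blocks: a one-pole low-pass filter and a static hard-knee compressor. It is my own illustration, not code from any actual plugin or DAW, and the function names (`one_pole_lowpass`, `compress`) and default values (48 kHz sample rate, -12 dB threshold, 4:1 ratio) are assumptions chosen for the example. The point is only that everything here is fixed arithmetic on samples: no training data, no model, no inference.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate=48000):
    """One-pole low-pass filter: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    # The smoothing coefficient is computed from a formula, not learned.
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    y = 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out


def compress(samples, threshold_db=-12.0, ratio=4.0):
    """Static hard-knee compressor (no attack/release, for brevity)."""
    out = []
    for x in samples:
        # Sample level in dBFS; the floor avoids log10(0) on silence.
        level_db = 20.0 * math.log10(max(abs(x), 1e-9))
        over_db = level_db - threshold_db
        if over_db > 0.0:
            # Above the threshold, the overshoot is divided by the ratio.
            gain_db = over_db / ratio - over_db
        else:
            gain_db = 0.0
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out


if __name__ == "__main__":
    step = [1.0] * 1000
    print(one_pole_lowpass(step, cutoff_hz=1000.0)[-1])
    print(compress([1.0, 0.1]))
```

Feed either function the same samples twice and you get the same output twice. Real compressors add attack and release smoothing, and real filters are usually higher order, but those are still fixed formulas of the same kind.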
Some plugins do use machine learning for tasks like noise reduction or stem separation – for example real-time denoisers trained on speech like VoiceGate, or deep-learning-based noise reduction projects like DeepFilterNet. Fine. But your environment does not become AI just because a plugin inside it happens to use it. Your microwave doesn’t become AI if you heat a smart thermometer inside it either.

The Real Issue: AI Musicians Who Don’t Understand Music Tools

This whole conversation actually exposed something deeper, which is what makes the topic interesting and why I chose to highlight the idiocy: Many AI-first creators cannot explain – or even identify – the tools they’re supposedly replacing.

There’s a growing crowd of prompt-pushers who have never mixed a track manually, never aligned vocals without an AI tool, never programmed automation by hand, never learned gain staging, never rendered or layered anything intentionally, and absolutely never used a DAW beyond dragging stems into the timeline. Yet they lecture others on “how audio production really works”.

And when someone challenges their nonsense, they fire off buzzwords like:
“you refuse to be educated”
“you’re a luddite”
“DAWs used AI for decades!”
“it’s the same thing!”

No. It’s not. And calling someone a luddite doesn’t turn confusion into expertise. It just telegraphs desperation.

Why This Matters

The problem isn’t people using AI. The problem is people pretending that AI makes them instant audio engineers – and then attacking anyone who points out the difference between a tool and a technology. AI is powerful. It’s useful – but it doesn’t replace understanding.

If you think a DAW is artificial intelligence, you’re not an innovator. You’re not “ahead of the curve”. You’re not misunderstood. You’re just wrong. And loudly so.

If you want to be taken seriously as a creator in this hybrid world of AI-assisted music:
Learn what your tools are.
Learn what your tools are not.
Stop claiming everything with buttons and soundwaves is AI. Because right now, the biggest challenge for AI-powered music isn’t the tech. It’s the users who don’t understand it.