Difference between revisions of "Voice imitation algorithms"

From SI410
Line 1: Line 1:
 
'''Voice imitation algorithms''' (also known as '''[https://en.wikipedia.org/wiki/Speech_synthesis Speech synthesis]'''<ref>https://thehill.com/opinion/cybersecurity/470826-perception-wont-be-reality-once-ai-can-manipulate-what-we-see</ref>) are a form of [https://en.wikipedia.org/wiki/Synthetic_media Synthetic Media], used to imitate human speech. They achieve this by using [https://en.wikipedia.org/wiki/Machine_learning machine learning] and [https://en.wikipedia.org/wiki/Artificial_intelligence artificial intelligence] techniques<ref>https://www.sciencedirect.com/science/article/pii/S0007681319301600?via%3Dihub</ref>.  
 
'''Voice imitation algorithms''' (also known as '''[https://en.wikipedia.org/wiki/Speech_synthesis Speech synthesis]'''<ref>https://thehill.com/opinion/cybersecurity/470826-perception-wont-be-reality-once-ai-can-manipulate-what-we-see</ref>) are a form of [https://en.wikipedia.org/wiki/Synthetic_media Synthetic Media], used to imitate human speech. They achieve this by using [https://en.wikipedia.org/wiki/Machine_learning machine learning] and [https://en.wikipedia.org/wiki/Artificial_intelligence artificial intelligence] techniques<ref>https://www.sciencedirect.com/science/article/pii/S0007681319301600?via%3Dihub</ref>.  
 +
 
== History ==
 
== History ==
The [https://en.wikipedia.org/wiki/Speak_%26_Spell_(toy) Speak and Spell] was originally introduced in 1978 by [https://en.wikipedia.org/wiki/Texas_Instruments Texas Instruments]. It featured a keyboard and a speech synthesizer, which was used to convert words that were typed onto the keyboard into synthesized audio that it played from it's speakers.  
+
===Commercial implementation===
 +
The [https://en.wikipedia.org/wiki/Speak_%26_Spell_(toy) Speak and Spell] was originally introduced in 1978 by [https://en.wikipedia.org/wiki/Texas_Instruments Texas Instruments]. It featured a keyboard and a speech synthesizer, which was used to convert words that were typed onto the keyboard into synthesized audio that it played from speakers.  
  
Lyrebird (also known as '''Lyrebirde AI''') was a Montreal based company founded in 2017 focused on speech synthesis and voice imitation.<ref>https://www.wired.com/brandlab/2018/10/lyrebird-uses-ai-find-artificial-voice/</ref> In 2019 it was acquired by Descript, an American company focused on [https://en.wikipedia.org/wiki/Audio_editing_software audio editing software], specifically tailored towards [https://en.wikipedia.org/wiki/Podcast podcast creators]<ref>https://www.businessinsider.com/groupon-founder-andrew-mason-new-startup-descript-detour-2017-12</ref> Lyrebird AI uses artificial intelligence and voice samples to replicate human speech
+
[https://www.descript.com/lyrebird-ai?source=lyrebird Lyrebird] (also known as '''Lyrebird AI''') was a Montreal based company founded in 2017 focused on speech synthesis and voice imitation.<ref>https://www.wired.com/brandlab/2018/10/lyrebird-uses-ai-find-artificial-voice/</ref> In 2019 it was acquired by Descript, an American company focused on [https://en.wikipedia.org/wiki/Audio_editing_software audio editing software], specifically tailored towards [https://en.wikipedia.org/wiki/Podcast podcast creators].<ref>https://www.businessinsider.com/groupon-founder-andrew-mason-new-startup-descript-detour-2017-12</ref> Lyrebird AI uses artificial intelligence and voice samples to accurately replicate human speech.
 +
 
 +
China-based [https://en.wikipedia.org/wiki/Technology_company technology company] [https://en.wikipedia.org/wiki/Baidu Baidu] has used [https://en.wikipedia.org/wiki/Artificial_neural_network neural networks] and [https://en.wikipedia.org/wiki/Deep_learning deep learning] to create accurate voice imitations from thousands of collected voice samples.<ref>https://www.technologyreview.com/f/610386/a-new-algorithm-can-mimic-your-voice-with-just-snippets-of-audio/</ref><ref>http://research.baidu.com/Blog/index-view?id=91</ref>
  
  
 
== radnom ==
 
== radnom ==
  
 
+
== radnom ==
 +
== radnom ==
  
 
Examples, Lyrebird AI
 
Examples, Lyrebird AI
  
 
*References
 
*References

Revision as of 18:26, 13 March 2020

Voice imitation algorithms (also known as speech synthesis[1]) are a form of synthetic media used to imitate human speech. They do so using machine learning and artificial intelligence techniques.[2]

History

Commercial implementation

The Speak & Spell was introduced in 1978 by Texas Instruments. It featured a keyboard and a speech synthesizer, which converted words typed on the keyboard into synthesized audio played through its speakers.

Lyrebird (also known as Lyrebird AI) was a Montreal-based company, founded in 2017, that focused on speech synthesis and voice imitation.[3] In 2019 it was acquired by Descript, an American company that makes audio editing software tailored to podcast creators.[4] Lyrebird AI uses artificial intelligence and voice samples to accurately replicate human speech.

China-based technology company Baidu has used neural networks and deep learning to create accurate voice imitations from thousands of collected voice samples.[5][6]


radnom

radnom

radnom

Examples, Lyrebird AI

References

  1. https://thehill.com/opinion/cybersecurity/470826-perception-wont-be-reality-once-ai-can-manipulate-what-we-see
  2. https://www.sciencedirect.com/science/article/pii/S0007681319301600?via%3Dihub
  3. https://www.wired.com/brandlab/2018/10/lyrebird-uses-ai-find-artificial-voice/
  4. https://www.businessinsider.com/groupon-founder-andrew-mason-new-startup-descript-detour-2017-12
  5. https://www.technologyreview.com/f/610386/a-new-algorithm-can-mimic-your-voice-with-just-snippets-of-audio/
  6. http://research.baidu.com/Blog/index-view?id=91