Amazon Alexa (Amazon Echo)
The Alexa Voice System, commonly known as Alexa, is an intelligent personal assistant developed by the American e-commerce company Amazon. The service is currently available on a variety of platforms, including smartphones and televisions, and on dedicated hardware such as the Amazon Echo. Alexa is similar in function to other "virtual assistants" such as Siri, Google Assistant, and Microsoft Cortana in that it responds to voice commands from a user to perform a wide array of tasks, from playing music to relaying the news to controlling other smart home devices. The Amazon Echo, and similar devices such as Google Home, are among the first "smart home" assistants. The central idea of the smart home is to bring the Internet of Things into everyday household objects in order to collect useful data and automate simple tasks. Alexa is intended to provide users with an easy-to-use interface for this data and those tasks. Far-field voice control lets Alexa hear users from across the room even when music is playing. To activate Alexa, the user says a wake word such as "Alexa" or "Echo", followed by a command or request.
- 1 History
- 2 Functions
- 3 Alexa Prize
- 4 Alexa Fund
- 5 Ethical Implications
- 5.1 Security
- 5.2 Privacy/Surveillance
- 5.3 Perfect Voyeurism?
- 5.4 Benton County, Arkansas Homicide Investigation
- 5.5 Voice Activated Device Calls 911 During Assault
- 5.6 Becoming A Robotic Society
- 5.7 Google Home vs. Burger King
- 5.8 The Use of Female/Women-Identified Voices as Artificial Intelligence Assistants
- 5.9 Ethical Skill Creation
- 5.10 Healthcare Skills
- 6 See Also
- 7 References
Development of the Amazon Echo began in 2010. Amazon executive Dave Limp believed the idea hinted at a new, unique experience but would require many iterations to be successful in the smart home market. Echo should not be confused with Alexa: Echo is the line of physical devices that Amazon produces, while Alexa is the cloud-based voice service that runs not only on Echo devices but also on products from third-party manufacturers such as BMW, Sonos, and Bose. The biggest challenge Amazon faced in building the Echo was that no product on the market served the same purpose, so there was no model for the company to improve upon. Engineers and product designers at Amazon were therefore tasked with inventing a novel product. Other voice control systems, such as Apple's Siri, Google's Voice Search, and Microsoft's Kinect, were fundamentally different from Amazon's product, because the Amazon Echo had no screen on which users could see their voice input or manually enter commands.
For Amazon Echo to be competitive on the market, Amazon needed to create a built-in virtual assistant, Alexa, that was capable of responding to queries quickly and conversationally. After years of unsuccessful iterations, Amazon Alexa was able to respond to voice queries in less than 1.5 seconds on average, far faster than competing voice-recognition technologies of the time. Following its introduction in November 2014, Amazon Echo was noted as one of the biggest hits in Amazon's history and made Amazon Alexa widely popular.
Since the introduction of the Amazon Echo in 2014, the Echo has gone through two design iterations, reaching its third generation in the fall of 2018, which boasts a better speaker and a sleeker design. Available in 40 countries and able to speak 5 languages, Amazon Alexa has grown significantly in the past few years. Originally, the Amazon Echo was a smart speaker with the voice assistant Alexa enabled. The Echo line has since expanded to more than 7 devices whose functionality ranges from automotive use to improved acoustics.
Amazon Echo is activated when the wake word, "Alexa" by default, is spoken. Users who do not wish to say "Alexa" can change the wake word. Once activated, Alexa responds through the speakers of an Alexa-compatible device, such as Amazon Echo. The audio spoken after the wake word, along with less than a second of audio from before it, is sent to the cloud. The ring on top of the Echo speaker turns light blue to let the user know that audio is being streamed to the cloud.
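The wake-word behavior described above, where only a short stretch of audio from before the wake word is kept and everything after it is streamed, can be sketched with a small ring buffer. This is a minimal illustration under stated assumptions, not Amazon's implementation: the `PreRollBuffer` class, the `stream_after_wake_word` function, and the use of plain strings as stand-ins for audio frames are all hypothetical.

```python
from collections import deque


class PreRollBuffer:
    """Hypothetical ring buffer that keeps only the most recent frames."""

    def __init__(self, max_frames):
        # A deque with maxlen silently drops the oldest frame when full.
        self._frames = deque(maxlen=max_frames)

    def push(self, frame):
        self._frames.append(frame)

    def drain(self):
        # Hand over the buffered frames and forget them locally.
        frames = list(self._frames)
        self._frames.clear()
        return frames


def stream_after_wake_word(frames, wake_word="alexa", pre_roll_frames=2):
    """Return the frames that would be uploaded: a short pre-roll,
    the wake-word frame, and every frame that follows it."""
    buffer = PreRollBuffer(pre_roll_frames)
    uploaded = []
    streaming = False
    for frame in frames:
        if streaming:
            uploaded.append(frame)
        elif frame == wake_word:
            # Wake word detected: send the pre-roll, then start streaming.
            uploaded.extend(buffer.drain())
            uploaded.append(frame)
            streaming = True
        else:
            # Before the wake word, audio only passes through the buffer.
            buffer.push(frame)
    return uploaded
```

In this simplified model, audio heard before the wake word never leaves the device except for the last few buffered frames. A real device would also stop streaming once the request ends, which the sketch omits.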
Amazon Alexa and Echo are compatible with the music services Amazon Music, Spotify, Pandora, TuneIn, and iHeartRadio. The Amazon Echo provides 360 degree omni-directional audio and contains a 2.5 inch woofer and 2.0 inch tweeter for deep bass and crisp high notes. Amazon Alexa can hear you from across the room with far-field voice recognition and can even hear you while music is playing from the unit. 
Amazon Alexa also supports Alexa skills, which are voice-driven capabilities designed to enhance the functionality of Alexa and Amazon devices. Alexa can answer questions that users ask and refer them to outside resources. Alexa can also play music, set timers and alarms, create shopping lists, give news and sports updates, check calendars, get traffic information, order Amazon products, and perform many other functions to help users in their day-to-day lives. Alexa's feature set keeps growing as the technology is constantly being updated. On April 15, 2015, Amazon launched a home automation feature that allows Alexa to interact with devices including WeMo, Philips Hue, Samsung SmartThings, Nest, ecobee, and others. With the free Alexa App on Fire OS, Android, iOS, and desktop browsers, users can easily set up and manage their Alexa devices. As of November 2016, the Alexa Appstore had over 5,000 functions available, compared to 1,000 functions in June 2016. The Alexa App is also where one can discover and enable third-party skills.
Amazon and Google are both building libraries of skills for their respective voice assistants: as of June 2018, Alexa had 23,758 skills and Google Assistant had 1,001. Measured over a 45-day period by Mi et al., the number of skills grew by 8% for Alexa and by 42% for Google. How skills are used can be broken into several components, including skill invocation, skill interaction, rogue skill mitigation, and invocation confusion. Skill invocation is either explicit or implicit. Explicit invocation occurs when a skill is referred to specifically by name, such as the name of a bank or store. Implicit invocation occurs when a user describes a task without directly uttering the name of the skill.
For implicit invocation, Google Assistant identifies a skill and analyzes whether the context of the conversation with the user is appropriate for it. Alexa supports this mode of invocation for specific types of skills. Skill interaction relies on an interaction model through which a virtual personal assistant (VPA) communicates with users. This model allows the VPA to interpret voice requests as commands. A request consists of a wake word, a trigger phrase, and a skill invocation name. In the sentence "Hey Google, talk to personal chef," for example, "Hey Google" is the wake word, "talk to" is the trigger phrase, and "personal chef" is the skill invocation name. The trigger phrase is provided by the VPA system and includes common terms like "open", "ask", "tell", and "start." When a skill is defined, intents and sample utterances map the user's voice inputs to interfaces of the skill, which are finally translated into actions. To link sentences to intents, a software developer specifies sample utterances, a set of sentence templates that cover the ways a user is likely to speak to the skill. The model also includes built-in intents that define many common utterances. Where intents need to be prioritized, the developer can either add more intents or specify a default intent, in which case all user requests that match the skill are routed to that single intent instead of being searched against all possible intents in the system. Rogue skill invocation is a problem in which duplicated invocation names and mistaken skill invocations can be exploited. Another point of weakness is a design that does not allow fluid movement between skills: both Google Assistant and Amazon Alexa can only execute a single skill at a time.
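The three-part request structure described above (wake word, trigger phrase, skill invocation name) can be illustrated with a short parsing sketch. It is a simplified illustration on already-transcribed text, not how Google Assistant or Alexa process speech: the `parse_invocation` function and the `WAKE_WORDS` and `TRIGGER_PHRASES` tuples are hypothetical, with the trigger phrases taken from the examples in the text.

```python
# Assumed vocabularies for illustration only.
WAKE_WORDS = ("hey google", "ok google", "alexa")
TRIGGER_PHRASES = ("talk to", "open", "ask", "tell", "start")


def parse_invocation(utterance):
    """Split a transcribed request into wake word, trigger phrase,
    and skill invocation name. Returns None if any part is missing."""
    text = utterance.lower().strip().rstrip(".!?")

    # 1. The request must begin with a known wake word.
    wake = next((w for w in WAKE_WORDS if text.startswith(w)), None)
    if wake is None:
        return None
    rest = text[len(wake):].lstrip(" ,")

    # 2. The wake word must be followed by a trigger phrase.
    trigger = next(
        (t for t in TRIGGER_PHRASES if rest.startswith(t + " ")), None
    )
    if trigger is None:
        return None

    # 3. Whatever remains is treated as the skill invocation name.
    name = rest[len(trigger):].strip()
    if not name:
        return None
    return {
        "wake_word": wake,
        "trigger_phrase": trigger,
        "invocation_name": name,
    }
```

Calling `parse_invocation("Hey Google, talk to personal chef")` separates the sentence into the same three parts named in the text. Because the invocation name is simply whatever follows the trigger phrase, any skill registered under a similar-sounding name could be reached by the same sentence, which is the weakness the rogue-skill discussion points to.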
Once a skill is running, it must be stopped before another skill can be launched. This mainly hurts user-friendliness, but it also exposes the system to malicious attacks. Invocation confusion is a phenomenon that software developers at Google and Amazon are attempting to minimize by testing invocation names and ensuring that skills can be launched with high success rates. Despite these efforts, Mi et al. were able to demonstrate experimentally that an adversary can intentionally induce confusion by using a name similar to that of a target skill, tricking a user into invoking an attack skill when trying to open the target. In one demonstrated example, a user aiming for the "Capital One" skill can be attacked by an agent who registers a skill named "Capital Won," "Capitol One," or "Captain One." Such names are hard to tell apart, particularly in the presence of noise, due to the limitations of modern speech recognition techniques.
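The kind of check a platform might run to flag confusable invocation names can be sketched as follows. This is only an approximation under stated assumptions: it compares spellings with Python's standard `difflib.SequenceMatcher`, which is a crude stand-in for the acoustic similarity that actually causes speech recognizers to confuse names. The `confusable_names` function and the 0.75 threshold are hypothetical and are not taken from Mi et al.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def confusable_names(target, registered, threshold=0.75):
    """Return registered names that closely resemble the target
    invocation name without being identical to it."""
    return [
        name
        for name in registered
        if name.lower() != target.lower()
        and similarity(target, name) >= threshold
    ]


# Example registry, including the look-alike names from the text.
registry = [
    "Capital One",
    "Capital Won",
    "Capitol One",
    "Captain One",
    "Weather Today",
]
suspects = confusable_names("Capital One", registry)
```

With this registry, the three look-alike names are flagged and the unrelated skill is not. A realistic check would compare phonetic transcriptions, since names that are spelled differently can sound identical.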
Alexa Skills Kit
The Alexa Skills Kit (ASK) is a collection of self-service APIs, tools, documentation, and code samples that enable designers and developers to create and publish skills to Alexa. ASK is free to use and Alexa skill developers can apply to receive promotional credits towards developing further Alexa skills. These skills can be downloaded for free via the Alexa app and tutorials are available for developers to learn how to build voice experiences for their new or existing applications. A new addition to the Alexa Skills Kit is the Smart Home Skill API that provides a set of built-in smart home capabilities. Examples of these capabilities include the ability to control lights, fans, switches, thermostats, garage doors, sprinklers, locks, and more.  The Smart Home Skill API taps into Amazon's standardized language model, relieving developers from building the voice interaction model for their smart home skill.
Voice communication in everyday language remains one of the ultimate challenges for artificial intelligence, and Amazon continues to promote the advancement of natural language processing. On September 29, 2016, Amazon announced the Alexa Prize, a $2.5 million university competition to advance conversational AI through voice. Teams of up to ten university students from all over the world competed, and selected teams received a $100,000 research grant as a stipend. The competition started on November 14, 2016 and ran until November 2017, ending with an award ceremony in Las Vegas, Nevada. The goal of the first competition was to create a socialbot: a program built with the Alexa Skills Kit that can converse coherently and engagingly with human users on popular topics for twenty minutes.
The combined University of Washington Electrical Engineering team and Paul G. Allen School of Computer Science & Engineering team won first place in the Alexa Prize and earned $500,000 for their efforts. Together, they developed a chatbot intended to immerse the user in a dynamic, realistic conversation and to fundamentally alter the way users interface with various devices around the home and office. Their chatbot, called Sounding Board, stood out because it was designed to interpret different aspects of human conversation, such as attitude, context, and personality. The University of Washington team supplemented this by letting Sounding Board gather information in real time and then make connections, so that its topics would be not only interesting but also relevant and up to date.
To further advance voice technologies, Amazon announced a $100 million venture capital fund on June 25, 2015 to help spur innovation in three specific areas. The three areas are:
- Hardware products for inside the home, outside the home, or on-the-go
- Skills that deliver new abilities to Alexa-enabled devices
- New contributions to the science behind voice technology
To reach startups and entrepreneurs at different stages of innovation, the Alexa Fund consists of three parts:
- A University fellowship program
- A Venture Capital investment arm
- An accelerator or incubator
Such a fund would allow Amazon to further their current products and allow greater development in the realms of text to speech, automatic speech recognition, and artificial intelligence.
Many ethical concerns regarding privacy and security have been raised about Alexa and other voice response systems. One of the greatest concerns is the device's ability to hear and record users when it is not actively being used. Another major concern is the ability of governments and independent hackers to use these features to gather information on users. Both concerns raise ethical questions that have become increasingly prominent in the news.
Alexa has the ability to communicate with third-party services through the Alexa Skills Kit interface in order to carry out commands. The many possible functions that Alexa can be programmed to serve are a cause for concern for many. For instance, a third-party service which interfaces with an Amazon Echo could retain records of its users' Alexa usage -- complete with personally identifying information (PII) -- and subsequently make use of that data without the end-users' knowledge.
Many are also concerned that malicious people could hack into the system in order to eavesdrop on users in their homes. Amazon counters this concern by stating that the data sent from the device to Amazon's servers is encrypted, helping secure user information from potential hackers. This explanation, however, does not account for software and firmware modifications on the Echo device itself. Encryption only preserves the security of the voice data in transit between the Echo and the Amazon servers that power Alexa.
Another privacy and security controversy was sparked when a San Diego area television station reported the story of a 6-year-old who accidentally ordered a $170 dollhouse and cookies via Alexa. In the segment, the reporters presenting the story repeated the Echo's wake word and mimicked the child's order, which prompted viewers' Alexa devices all over San Diego to attempt to place Amazon orders.
Many Echo owners feel that the devices may intrude on their privacy because they are constantly listening, as the dollhouse incident illustrated when Alexa placed an order at the request of an unwitting 6-year-old. Others, however, feel that parents and owners are responsible for understanding the capabilities of the technology they are buying and for setting security measures accordingly. Alexa and the Echo do offer options that can prevent unwanted purchases, and owners can change the wake word if they so desire. While it is Amazon's duty to build security measures into its products, consumers are also responsible for understanding the technology they are buying and how they can adjust it to better suit their needs.
Though Amazon Alexa is activated only when the wake word is spoken, the Amazon Echo is always listening. It is programmed to listen for the word "Alexa", so it constantly analyzes every sentence it hears to detect whether that word has been spoken. This can be a problem, especially when television is involved. Alexa may respond to its own ads playing on the user's television because it is unable to distinguish sound from the television from an actual person speaking. In both cases, the audio that follows the wake word is sent to Amazon's cloud servers to determine the correct response, but also to learn more about the user. While Alexa is always listening, it only sends the content that is said after it is activated. After all, Alexa is a voice assistant, and a large part of how voice assistants function is recording the user's voice. This voice information is used to better personalize the user's experience; however, it is unclear how long the data is stored on Amazon's servers.
Users have expressed concerns about how much Alexa listens to them, and whether that affects their privacy or could be used by law enforcement. A spokesperson from Amazon stated that there must be a proper legal warrant for Amazon to give out customer information. Users can also delete voice recordings that were sent to Alexa by going to the History tab in the Settings section of the Alexa App; however, Amazon warns that deleting voice recordings and interactions made with Alexa may affect the user experience. Another option is to change the wake word to something more unique that is less likely to be said by mistake or picked up accidentally. Users have also expressed concerns that Amazon Alexa is listening to private conversations and storing the information. Amazon has responded by stating that it only stores information after the wake word has been spoken. In everyday conversation, however, the wake word can come up by chance, activating Alexa to listen to the rest of the conversation regardless of whether turning on the device was intentional. This confuses customers who accidentally turn Alexa on, and it raises concerns about how often Alexa is storing what they are saying.
It is worth noting that when asked, "Are you connected to the CIA?", Alexa shuts off and ignores the question. A video of this behavior was posted on YouTube, where it struck many viewers as creepy and spread widely online. The video added to the privacy, security, and surveillance concerns surrounding Amazon Alexa. Many individuals worry that their personal information is being relayed to others without their knowledge, that Alexa is constantly listening to their conversations, and that those conversations are being recorded and potentially sent to the CIA. Amazon Alexa streams audio to the cloud, where conversations are stored. A user's recordings are deleted from the cloud only when the user chooses to delete them, although doing so might degrade future experiences. Because of these features, Alexa recordings have become a controversial topic, with customers questioning the ethical implications of a device that stores their personal conversations.
In March 2018, many news sources reported on a series of incidents in which Amazon Alexa devices had laughed at random without being asked. The spontaneous laughing made many users uncomfortable, particularly due to the nature of the laugh, which was widely described as "creepy" and "eerie." One man's recording of the incident, in which he asks Alexa to "play the last sound", went viral on Twitter. The situation raised questions and concerns about privacy, as many such devices sit in intimate spaces, waiting to be prompted.
In response to these reports, Amazon came forward and explained that Alexa had been programmed to laugh at the command, "Alexa, laugh." Amazon explained that it was likely that the device misunderstood the audio in its proximity as the command to laugh, and responded accordingly. Amazon reported that it would create a system update in which Alexa would be prompted to laugh only at the command, "Alexa, can you laugh?", in an effort to reduce false positives from the device.
The device's capability to listen continuously in anticipation of the wake word relates to Tony Doyle's notion of "Perfect Voyeurism". Doyle defines Perfect Voyeurism as "covert watching or listening that is neither discovered nor publicized" and argues that there is nothing wrong with this type of voyeurism. Prior to cases such as the dollhouse incident, this perfect voyeurism could be maintained, as users were likely oblivious to the device's eavesdropping capabilities. Now that users have become aware of the listening, however, they have taken measures to avoid it: users can turn off the microphone on the Amazon Echo or Echo Dot by pressing a button on the top of the device. This is consistent with Doyle's position, as he states, "I maintain that, if detected or publicized, voyeurism can do grave harm". The harm is evident in users' desire to avoid the listening. Avoidance also reduces functionality: when the button turns red, the device will not respond to its wake word or the action button until the microphone is turned on again. Given the exposed surveillance, users have to choose between ease of functionality, with the reduced privacy that comes with the surveillance, and privacy at the cost of functionality.
Benton County, Arkansas Homicide Investigation
Amazon Alexa raises concerns regarding access to and ownership of its voice recordings and audio files, especially in criminal investigations. During a homicide investigation in Benton County, Arkansas in 2015, investigators filed a warrant to collect the audio recordings from an Amazon Echo device in the home of the homicide suspect. Amazon refused to provide the recordings to investigators, citing privacy and constitutional rights under the First Amendment. In a February 2017 court filing, Amazon pushed back against the warrant, saying that it would not provide the recordings and transcripts "unless the Court finds that the State has met its heightened burden for compelled production of such materials". In March 2017, Kathleen Zellner, the lawyer for the suspect, James Bates, filed a motion stating that Bates would provide the recordings voluntarily. Authorities have asked for evidence from technological devices before, but this may have been the first time a smart speaker has been the holder of information, according to Joel Reidenberg, the Founding Academic Director of Fordham University's Center on Law and Information Policy. While this is the first case Reidenberg has seen involving smart speakers, he noted that he was not surprised.
Voice Activated Device Calls 911 During Assault
In a New Mexico domestic violence incident, Eduardo Barros was charged with beating and threatening to kill his former girlfriend. Barros asked his girlfriend whether she had called the police, saying "Did you call the sheriff?" The voice-activated device in their home, first identified as an Amazon Echo, interpreted Barros's question as the command "Call the sheriff" and called 911. Hearing no response on the 911 call, operators called the victim's phone back. After seeing the call, Barros proceeded to push his girlfriend to the ground and kick her. When police and emergency services arrived, Barros was arrested after a two-hour standoff with the SWAT team. Asked how this 911 call was possible, an Amazon spokesperson responded that the Echo does not have the built-in capability to call the police, which suggests that a similar device was used in this instance. While in this scenario a device was able to help save a life, the incident raises concerns that the security functions of these devices are not strong enough. In the case of the 6-year-old who ordered a dollhouse, a simple spoken command was enough to perform an advanced function. A similar thing happened here, but the implications were far larger. It raises the question: how much capability should these devices have through voice command alone?
Becoming A Robotic Society
Without producing an individual idea, “Alexa” agrees with any opinion one may have. Who wouldn’t choose that over approaching another human with their own point of view? Due to “Alexa’s” ease of availability and lack of opinion, a human would now prefer to direct questions at a machine. Without opinion and debate, there is no personal identity: we become a robotic society. The technology-run world we are headed towards does not seem filled with companionship or respect for others. Amazon intended for “Alexa” to be more of a shopping assistant, but she has become a friend to her owners. Due to “her” lack of opinions, though, “Alexa” is bossed around like a machine, not at all treated like a friend. Interpersonal skills are not required anymore. Issues arise in human-to-human interactions due to impolite or disrespectful body language and tone of voice. Only 7% of any message is conveyed through words; the other 93% is nonverbal communication. “Alexa” cannot read these nonverbal cues, so humans speak to her in disrespectful ways without consequences. Taking feeling and emotion out of the equation, humans all become the same, like “Alexa.”
Google Home vs. Burger King
Although not an ethical implication of Amazon's Alexa specifically, a similar product, Google Home, was embroiled in controversy surrounding the privacy of its users. The fast food giant Burger King ran an advertisement that featured a man saying "OK Google, what is the Whopper burger?" The "OK Google" at the beginning of the statement activates the device, which responded to the question by reading from the Whopper burger's Wikipedia entry, much to the irritation of Google Home owners who had their televisions on when the advertisement aired. When people caught wind of what Burger King was attempting to do, the Wikipedia entry for the Whopper burger was changed to include inappropriate statements, such as listing cyanide as one of the ingredients. The advertisement sparked outrage and conversation about the ethics of this advertising maneuver. Like Google Home, Amazon Alexa can be activated by a fairly simple command: "Alexa."
As discussed in "Protecting privacy in public? Surveillance technologies and the value of public places" by Jason W. Patton, highly intelligent surveillance technologies that are accessible in public spaces are "detrimental to the social, cultural, and civic importance of these places." While Amazon Alexa and Google Home are not products in public places, the Burger King advertisement demonstrates that they can transform a private home into a public space when tapped by a third party. Patton concludes that the protection of public spaces is just as important as individual privacy; people relish the value that a free environment can provide them, and it is critical to avoid infringing on that freedom. Amazon Alexa and Google Home are not surveillance technologies, but because they are easily manipulated in private spaces, and potentially in public ones, they can infringe on both individual rights and safety in public spaces.
The Use of Female/Women-Identified Voices as Artificial Intelligence Assistants
Among the big four technology corporations, Apple uses the feminine voice of Siri, Microsoft uses the feminine voice of Cortana, Amazon uses the feminine voice of Alexa, and Google uses a feminine voice on its Google Home assistants. While Apple and Google let the user change the voice of their assistants, not all platforms offer such options, and the feminine voices are the default settings. According to user research from the Nielsen Norman Group, default options are important choices made by interface designers, as defaults are seldom, if ever, changed by the majority of users.
Women are often expected to be in administrative roles, as secretaries, assistants, and other such service roles in the workplace. Additionally, women are often conditioned to display mothering behaviors, and this can affect the perception of women overall in the culture. Some say that this prevalence, which is due to many cultural and social factors in its own right, predisposes society to prefer listening to women's voices. It is worth noting, though, that feminine or women's voices are more heavily and acutely critiqued as grating, for participating in "uptalk," for having vocal fry, etc. than men's voices are.
One human-computer interaction ethicist posited that the stature of feminine voices as assistants in artificially intelligent software speaks to how the primarily masculine teams view and think about women, even stating that she feels it reflects that these men view women not only as subservient, but also as less than human. Regardless of this perception, the prevalence of so many feminine voices as artificially intelligent assistants "hard-codes a connection between a woman's voice and subservience." Unconscious bias comes from somewhere, and that somewhere is the cultural knowledge we "absorb from the world," that then is reflected outwards into the world unconsciously through individuals' choices, behaviors, and habits. Therefore ascribing women and feminine voices, personas, and names to subservient tools can promote these linkages between subservience and women, even more so than they are already entrenched in heteropatriarchal society.
By utilizing a feminine and woman-identified artificially intelligent operating system in the form of Alexa, these concepts of subservience and gender stereotypes have the possibility of being perpetuated and emphasized.
Ethical Skill Creation
To create more robust interactions between Alexa and the user, Amazon allows developers to create "Alexa Skills," giving them access to an array of APIs for building voice skills for the public. Amazon provides extensive documentation for Alexa Skill developers, including guidelines on how to design for the voice of Alexa. In the Alexa Design Guideline, Amazon highlights offensive content that Alexa Skill developers should stay away from, such as sensitive topics, sexual or medical content, politics and religion, profanity and derogatory terms, and artistic sources. A new Alexa Skill must pass multiple tests before it can be publicly launched. Through this process, Amazon is able to maintain a standard of quality as well as monitor the ethical interactions of the skills being produced.
Despite labeling medical content within skills as offensive content that breaks trust between users and Alexa, Amazon announced a new category of skills in April 2019 -- Alexa Healthcare Skills. These skills walk a fine line, as each Alexa Healthcare Skill must comply with the privacy and patient-doctor confidentiality requirements of HIPAA (the U.S. Health Insurance Portability and Accountability Act of 1996). Confidential information regarding medical history and patient records is not something to be taken lightly, as this personal information risks aggregation when it enters the digital realm. David Shoemaker alludes to the idea that information privacy becomes even more important as technological breakthroughs continue to occur and aggregations of personal data are collected through nontrivial tasks. The combination of personal data aggregated through Alexa with confidential medical data creates a significant opportunity for personal information to be compromised by hackers. Amazon has remained vague on the specifics of what patient information Alexa will have access to. How Amazon uses patients' personal data will be brought into question, since as a company it gains access not only to medical records but also to data from other services such as Amazon Marketplace, Prime Video, and Alexa, to name a few.