Head On A Stick: The Affordable Art Finish Platform for Humanoid Social Robotics Research and Development on the Desktop
Motivation
The potential of interacting with social humanoid robots has captured the imagination of millions. However, the actual ability to conduct research with such systems has been limited by the expense of their custom design and manufacturing. Just as early turnkey desktop computer systems in the 1980s democratized computers and made them concrete and real for a generation, our goal is to make social robotics concrete and real by bringing them to the desktop today.
Customer
Our initial target audience will be Human-Humanoid Robotic Interaction researchers, virtual assistant developers, conversational agent developers, and developers of medical training, cognitive therapy, and entertainment applications. These customers require robust, flexible tools that let them focus on the innovative aspects of their research or application rather than contending with the many operational concerns involved in using a robotic head today.
Value Proposition
Full-scale humanoid robots have been both custom (one of a kind) and expensive. Existing android heads are in the $50K-$125K price range and require extensive artistic, mechanical, electrical, and software support skills to operate and maintain.
The “Head on a Stick” (HOAS) project builds upon the existing Harmony platform currently in production. Our goal is to provide a high-quality android head with accessories (desk mount or full body) and voice and vision processing at an enabling price point of around $10K for the first wave of academic researchers and commercial application developers. The HOAS will be substantially open source, designed for experimentation, address a number of basic issues faced by existing systems, and be both modular and customizable. In particular, we will provide tools for those focused on the underlying technology as well as for those focused on social-level interactions. Feedback from over a hundred early adopters in the entertainment industry, as well as from individual private collectors, will also be incorporated into the new design.
Innovation
Realbotix has created a flexible platform that reduces the cost and time to customize the appearance and operation of our robotic heads. Existing robotic heads have a number of operational deficits in vision, audio, tactile sensing and control which will be systematically addressed by the project. The existing proprietary control system will be opened up to provide a platform suitable for both experimentation and education.
The Realbotix team has over two decades of experience in manufacturing art-finish humanoid faces and bodies. Realbotix has extensive experience in the area of humanoid social robots, and our previous work has been featured in The New York Times, Wired, and other popular media outlets. The PI has spent the last decade working in the domain of social robotics, the CEO has over 25 years of experience designing and manufacturing art-finish silicone heads and bodies, and the Engineering Director has experience with engineering and testing complex systems for both full-scale production and regulatory acceptance.
Commercial Opportunity
Commercial Potential: Overview
The market for social humanoid robots is large and growing. Depending on the market research definition used for social robotics, pre-pandemic the market was valued at $395 million in 2019 and was expected to grow to $699 million in 2023 and reach $918 million by 2026 (Knowledge Sourcing Intelligence, 2018). The market potential for our android head with accessories has been validated by the interest shown in social humanoid robots and direct inquiries by researchers, developers, and the general public. Our low-cost, easy-to-use humanoid social robot is expected to be a popular option for developers in the market for a social robot, including those in human-humanoid interaction research, virtual assistant development, conversational agent development, medical training and cognitive therapy development, education and entertainment. The market for social robotics is driven by the increasing demand for human-like interaction in both personal and business settings. Social robots can provide a more natural and engaging interaction than traditional user interfaces, which is appealing to both individuals and businesses. This project will transition social humanoid robots from being mere novelties to being more accessible and enable a greater number of researchers and developers to explore the true potential of social robotics.
Key Risks to Commercialization
We believe there are three major risks to bringing this solution to market. One risk is that errors in pricing/margin fail to make this a viable product line for the company due to manufacturing and support costs. Since we have already sold over 100 of the Harmony robotic heads, we feel comfortable estimating the additional development and manufacturing costs for the new system with vision, touch, and audio capabilities.
A related risk is that the size of the market isn’t big enough to support the company, or that other competitors will take market share. We feel this risk is manageable since we have already validated the market opportunity through discovery interviews with potential customers. The competition for social humanoid robots is currently dominated by expensive custom-made systems, so we feel confident that a lower-cost solution would generate a lot more interest in the entire social robotics concept. However, there is always the risk that new competitors may enter the market, or that larger/well-established humanoid robotics manufacturers can cut their costs substantially and compete for the same market space.
Finally, our most significant risk is that of intellectual property theft combined with supply chain issues. In order to mitigate these risks, we have started to purchase critical long lead time components. All mechanical components are manufactured in the USA and all molded components are made in-house. Additionally, all circuit boards will be manufactured in either the USA or in countries with strong intellectual property protections. Final assembly, programming, and testing will be done in our facility in Las Vegas.
Summary of the Competitive Landscape
Social humanoid robots are increasingly popular, with a growing number of companies entering the market. The Realbotix system is unique in that it is hard to find direct competition in terms of 1:1 humanoid systems (Faraj, 2021). The R&D version of Hanson Robotics’ Sophia lists for $80K (Hanson Robotics, 2020), Engineered Arts’ RoboThespian lists for $90K, and the recently announced Engineered Arts Ameca lists for around $130K (Reuters, 2021).
Realbotix will offer a unique combination of advanced features integrated into a turn-key “Head on a Stick” (HOAS) system at a lower price point than those available from the competition. Key features include the patented “Animagnetic” skin attachment system (McMullen, 2014) (McMullen, 2022), a suite of expressive and interactive services, integrated audio and vision processing capability, and an open-source development environment. Realbotix has developed a method that provides incredibly realistic human expressions using soft silicone formulas. This system allows faces to be partially decoupled from the underlying skull mechanism and actuated through the magnetic attachment system. This capability opens the door for the rapid build-out of a diverse set of face offerings by Realbotix, while also allowing others to create their own custom faces without needing to do full head designs. In terms of research, this system saves the time and effort required by developers to build their own platform from scratch, while providing access to expressive face technology and opening up opportunities for interdisciplinary community building.
While the head is designed to operate on existing passive bodies as well as the desk-mounted bust, one can easily imagine several Realbotix heads being used in a lab or educational setting along with more expensive, full-body systems.
The competitive landscape is expected to change dramatically in the next few years, with our product well-positioned to take advantage of this growth. Competitors will likely enter the market and increase the number of social humanoid robots at more affordable prices. However, our product will remain higher quality, be competitively priced, and maintain a unique value proposition that will set it apart from the competition. We believe that our product will help to drive down the cost of humanoid robots, making them more accessible to a wider range of users. This increase in competition will likely drive down the cost of social humanoid robots, making them more accessible to the average consumer, and may paradoxically also increase the overall size of the market as more people become aware of their capabilities.
However, the proposed system has been designed for high modularity and lower labor costs when assembling and repairing. This leads to a system that has high modification potential for the researcher, experimenter and artist. No other commercial system has taken this approach, especially at the $10K price point.
We estimate that the revenue potential for our innovation is $5M-$10M per year.
Realbotix plans to market this product to academic researchers and commercial application developers, using a combination of online and offline marketing techniques. Based on market research and feedback from potential customers, we believe that the economic benefits of this innovation could be significant, including increased productivity and efficiency in customers’ workflows as well as new opportunities for research and development.
University Research Market
The university research market for desktop social robotics is growing due to the increasing demand for robots that can interact with humans in a natural way. This market is expected to grow as universities increasingly adopt social robots into their research programs. The main drivers of demand for social robots in university research are the need for robots that can help researchers explore the complexities of human behavior and cognition, as well as the need for robots that can be used to study human emotions and reactions. Because social robots are in their infancy, the market for university research is one of the few places where they can be used to study the feasibility of human-humanoid interaction, and to develop the associated technology. Also, such systems can offer new opportunities for training for medical professionals and therapists.
According to the Carnegie Classification system as of 2021 (Carnegie, 2021), there are 1448 possibly relevant educational institutions in the US, including 559 research- and technology-focused programs. While not every institution is a perfect fit for this product, each of those that offer relevant curricula would see opportunities for use across multiple domains (robotics, vision, AI, art) and could highlight their interdisciplinary STEM offerings.
Realbotix is well positioned to capitalize on this growing market, as our technology is based on open-source platforms and our products are affordably priced. We offer a humanoid robot head that can be used for a wide range of research applications, from facial recognition to emotion recognition to human-robot interaction. This will make it an ideal platform for research in many different areas.
The Developer & Entertainment Markets
Virtual assistant and chatbot development is a growing market, and the HOAS platform will make it easy to create human-like robots with natural conversation abilities. There is also a large and growing market for humanoid robots in entertainment, including robots for personal companionship and for providing service in the home. In each case, the humanoid form is seen as providing a more engaging and personal experience than traditional robots or other types of animatronics. The global market for entertainment robots is worth $1.5 billion in 2022 and is projected to reach $4.3 billion by 2028 (Hornyak, 2021) (Industry Research, 2022). Realbotix can provide a high-quality, low-cost platform that makes it easy to create realistic and engaging human-like robots. Realbotix’s products have already been used both in audience-facing entertainment and in concept prototyping.
Commercial Potential: Financing/Revenue Model
Realbotix is part of Simulacra Corporation. Currently over a hundred robotic heads have been sold at $5K to $10K, depending on options. It is anticipated that most of the proposed HOAS systems will sell for $10K, with an initial target of 100 institutions each buying two systems in the first year, resulting in approximately $2M in revenue.
We feel this is conservative, and expect a higher number of interested institutions along with purchases from private owners and developers, as well as exploratory development by the entertainment and therapy markets. Outside of graduate-level research, potential application areas that will see interest in humanoid social robots include: education (tutoring, personalized learning, assistive technologies), healthcare (remote patient monitoring, physical therapy, geriatric care, patient simulation), entertainment (companionship, autism assistance, marketing), and industry (human-robot interfaces, customer service, telepresence). Indeed, the commercial market is potentially orders of magnitude larger, and the system is designed to be a first-stage enabler for those interested in such applications.
Milestones to Bring to Market
Realbotix has delivered over a hundred of its existing Harmony robotic heads to developers in entertainment and private collectors. This has provided Realbotix with a wealth of real-world experience with developing manufacturing processes that are directly applicable to the HOAS system. Additional work is anticipated in the area of automatic system calibration (covered substantially by this proposal) and developer-quality system documentation. At the end of Phase I, the software should be in a state to form (at the minimum) the base for an updated reactive self-contained version of the existing system, and ideally become the basis for beta testing with academic and developer communities.
Deliverables
The proposed work in this project will result in the following deliverables:
1. A low-cost, humanoid robot head with integrated vision and voice processing, suitable for academic research and commercial development.
2. A turnkey platform for humanoid social robotics development, including a desktop mount and full body adapter.
3. Open-source software and documentation, making the platform accessible to a wide range of users.
Technical Solution
Table 1. Comparison of the current Harmony system, Hanson Robotics' BINA-48, and the proposed HOAS system.

Attribute | Current Harmony System | BINA-48 | Proposed HOAS System
Manufacturer | Realbotix | Hanson Robotics | Realbotix
Degrees of Freedom/Motion | 10 | 30 | 16
Video | No video | Mono/external | Dual camera available over USB 3.0 to base or external processor
Vision | N/A | Face detection/recognition/tracking | Object and face tracking and identification at frame rate
Audio Output | Head-integrated Bluetooth | External speakers | USB module/Linux sound system/head-integrated speakers
Sound Input | External mono microphone in phone or tablet | External hand microphone | Microphone array embedded in base with noise cancellation and source position localization
Tactile | None | None | Capacitive/resistive touch
Communications | Bluetooth/WiFi | Wired/serial | WiFi/USB/Ethernet
Processing | Variable phone/tablet | External PC | Standardized Nvidia Nano/Xavier processor (0.5 to 7 Tflops)
Control Software | Proprietary phone/tablet app | Proprietary cognitive system | Open-sourced, Jupyter-based IDE
Virtual Agent Technology | Fixed format/AIML-style | Fixed format/LSA + AIML-style | Open format/GPT-based
Price Point | $5,000 to $10,000 (2020) | $125,000 (2010) | $8,500 to $12,500 (2023)
Figure 1. (left) Early Internal V1 Desktop Prototype (middle, right) 10 Degrees of Freedom and primary face vectors of Harmony V1 (Additional DOF for lip sync to be added in this project)
General Technical Approach and Focus:
While social robotics may capture the imagination of many, existing systems suffer from several technical deficits that must be addressed before they can reach their true potential. The proposed project addresses these deficits to make this class of social robot both functional and low in barriers to use for a wider audience, and thus more effective in the marketplace.
The primary project technical risk is that the identified deficits cannot be corrected in an effective manner compatible with organizational skills, knowledge, or methods. We propose to address this by conducting the focused research required to correct the deficits in vision, acoustic, and tactile sensing currently faced by most robot heads, and by performing the basic development needed to create a robust, easy-to-use system geared toward social robotics researchers and developers. This project will focus on performing the tests needed to collect the physical measurements required for quantifying or correcting each deficit, and on the system requirements for a minimally viable system for research, development, and education. Our primary goal is to offer a system that meets the expectations of developers, so they can focus more on their research interests and less on the details of the hardware or software.
Test fixtures to quantify physical performance in each sense modality will be constructed and used to collect baseline data and evaluate the performance of each development option. These test fixtures will enable test-driven development of proposed components and will be essential for ensuring overall system robustness. In addition, these fixtures will support test-driven development and evaluation of the final integrated system hardware and software.
Each sense modality will have its own test set. For vision, an HD-monitor-based system will be used to evaluate various optical qualities and the performance of vision systems. The vision measurement system is required to finalize the design of the integrated camera and optics (McMullen, 2019), to obtain the calibration parameters for the computer vision systems, and to test the vision tracking system. For audio, an acoustic test rig will be developed to evaluate the performance of various microphone array configurations and of noise cancellation algorithms. It will also be instrumental in testing higher-level functions such as speech recognition, motor noise suppression, localization, and reaction to non-visible sources. While Realbotix has pending patents on silicone-capacitive touch sensing, there is a need to quantify performance through the variable structure of the face, along with adapting to the changes inherent in a system that allows for face swapping. Most existing robotic heads simply ignore the tactile sensing modality, assuming contact will never occur. However, since detecting both near and actual contact enables the appearance of natural reactivity, integrating various tactile sensors will be examined, along with a test fixture to evaluate each option's performance.
The lack of a common software infrastructure or SDK across projects impedes rapid development and testing and creates problems with interoperability, robustness, and ease of exploratory development. These deficits create a barrier to entry for developers new to the field and limit the potential for novel applications and research. During proof-of-concept development, various independent off-the-shelf modules are often interconnected, allowing quick, low-cost validation at a reduced engineering cost. However, these demonstration-system compromises violate constraints on physical volume, number of power sources, and the placement and number of connections desirable for commercial solutions.
To address the hardware deficits common to most head projects, a set of integrated controller boards is used in the design. These boards vastly simplify the assembly process, allowing all power requirements to be met with a single USB-C connection. Additional control algorithms and intelligent protocol processing can be embedded in the servo controller system, along with interfaces to other industry-standard buses (I2C, CAN Bus, RS-485, etc.).
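As a purely illustrative host-side sketch (the serial device path, command identifier, packet framing, and checksum below are hypothetical placeholders, not the actual board protocol), commanding one servo channel over such a link might look like the following:

    # Hypothetical host-side sketch: drive one servo channel on the controller board
    # over a serial link.  The framing, command id, and device path are placeholders.
    import struct
    import serial   # pyserial

    PORT = "/dev/ttyUSB0"      # placeholder device node
    SET_POSITION = 0x10        # placeholder command identifier

    def set_servo(bus, channel, position):
        """Send <cmd, channel, position> followed by a simple additive checksum."""
        payload = struct.pack("<BBH", SET_POSITION, channel, position)
        checksum = sum(payload) & 0xFF
        bus.write(payload + bytes([checksum]))

    with serial.Serial(PORT, baudrate=115200, timeout=0.1) as bus:
        set_servo(bus, channel=3, position=2048)   # mid-range position, placeholder units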
The proposed research will build on the current frameworks to create open-source libraries that support a range of humanoid robotics capabilities. These libraries will provide functionality for basic motion control, vision processing, speech processing, and human-robot interaction. The main focus of this research is to develop an IDE and SDK for controlling the robot and to produce a working prototype that can be used by academic researchers and commercial application developers. Also, a high-level graphical user interface for personality specification and editing will provide a user-friendly interface for controlling the robot and allow the user to interact with the robot using vision and natural language.
The Plan to Protect IP and Leverage our Open-Source Offerings:
The company plans to protect its intellectual property while leveraging open-source development in order to maximize compatibility with the academic and research community. The company has an interest in higher IP protection of the integrated physical elements (“the hardware”) versus the software, and plans to file patent applications on physical systems (3 granted, 2 pending) and maintain select data sets as trade secrets (such as the processes and formulations used to produce the final product). While “custom face kits” may be one viable accessory product, such kits will not preclude providing high-end custom art-finish work as a service, and indeed may stimulate additional work for Realbotix in that area or spark a “skin” art community. Additional trade secret data will be generated focusing on improving embodied conversational interaction.
Intellectual Merits: Technical Discussion and R&D Plan
Creation of an android head by a researcher is currently difficult and complicated, requiring advanced skills in mechanical, electrical, and software engineering, as well as artistic and other disciplines (Faraj, 2021). The goal of this project is to remove these barriers by developing a low-cost, easily customizable, general-purpose humanoid android head platform and toolkit for both educational and commercial use. Just as early fully assembled personal computers lowered the barrier to entry for early enthusiasts and expanded the range of applications and end-users, we hope to do the same for humanoid social robotics.
The overall plan is to collect the information necessary to properly characterize and fix the identified deficits in our existing robotic head product, and then apply that information to construct a new prototype and measure the systemic improvement. In preparation, steady progress has been made in the foundations of the overall system. Realbotix has completed dual camera testing, achieved object tracking at >90 FPS, and used an off-the-shelf microphone array for speech recognition and reflexive audio source tracking. We are currently working on improving facial recognition and text-to-speech (TTS), and on developing the interface to a back-end server consisting of Jupyter Labs (Jupyter, 2022) and the EleutherAI GPT-J 6-billion-parameter language model (Komatsuzaki, 2021). A unique method of adapting the GPT-J model using stochastic updates of model layers on commodity hardware has been contributed back to the developer community. In parallel, an initial set of researchers and developers will be contacted for the first round of prototype and commercialization testing.
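As an illustration of the general idea behind such stochastic layer updates (a minimal sketch only, assuming the Hugging Face transformers and PyTorch libraries; it is not the released implementation, and the two unfrozen blocks per step, learning rate, and training text are placeholder choices), only a random subset of transformer blocks is made trainable on each optimization step so that gradient and optimizer state stay small enough for commodity hardware:

    # Sketch: adapt GPT-J by unfreezing only a random subset of transformer blocks
    # per step.  Simplified for illustration (no batching, mixed-precision care,
    # or checkpointing); values below are placeholders.
    import random
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "EleutherAI/gpt-j-6B"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)
    blocks = model.transformer.h                     # GPT-J transformer blocks

    def set_trainable(k):
        """Freeze everything, then unfreeze k randomly chosen blocks."""
        for p in model.parameters():
            p.requires_grad = False
        for i in random.sample(range(len(blocks)), k):
            for p in blocks[i].parameters():
                p.requires_grad = True

    for text in ["example adaptation text ..."]:     # placeholder corpus
        set_trainable(k=2)                           # small trainable subset per step
        opt = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=1e-5)
        batch = tok(text, return_tensors="pt")
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        opt.step()
        opt.zero_grad()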
Applied Research, Test Sets and System Development
We propose to create a low-cost, high-repeatability test set for each sense modality; each setup can be used internally and replicated by external developers, allowing for rigorous testing of vision and audio processing algorithms. For optics and vision hardware, we plan to create a standard test bench with a head, camera, tested lens set, and high-resolution (4K) monitors. The current cost of a commodity high-resolution monitor is less than that of a single printed official video calibration test pattern. With the monitor and controlling software, one has the option to perform various forms of auto-calibration and to measure both static imaging quality and dynamic image tracking performance at a systemic level. The ground-truth original image, what the camera captures WITHOUT the lens, and what the camera captures WITH the lens can all be compared to measure distortion and find the correction matrix required by OpenCV (OpenCV, 2015) and other image processors, as well as to evaluate the quality and field of view of each examined lens configuration.
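A minimal sketch of the intended monitor-based calibration workflow is given below (the checkerboard pattern size, capture directory, and refinement parameters are illustrative assumptions rather than final test-set values); the recovered camera matrix and distortion coefficients constitute the correction data used by the downstream vision software:

    # Sketch: estimate lens distortion from checkerboard frames displayed on the 4K
    # monitor and captured through the lens under test.  Assumes a 9x6 inner-corner
    # pattern and a directory of captured frames; values are illustrative.
    import glob
    import cv2
    import numpy as np

    PATTERN = (9, 6)                       # inner corners of the displayed checkerboard
    objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2)

    obj_points, img_points = [], []
    for path in glob.glob("captures/*.png"):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, PATTERN)
        if found:
            corners = cv2.cornerSubPix(
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
            obj_points.append(objp)
            img_points.append(corners)

    # Camera matrix K and distortion coefficients form the correction data.
    rms, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, gray.shape[::-1], None, None)
    undistorted = cv2.undistort(gray, K, dist)   # compare against the ground-truth pattern
    print("RMS reprojection error:", rms)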
Hearing provides the ability both to respond to verbal interactions and to detect events outside of visual range. It is anticipated that a microphone array will be embedded in the periphery of the bust base. A number of base microphone array configurations will be examined, along with suppression of internal noise sources using either internal microphones or common source-isolation algorithms. Given the ability of arrays to determine the audio source direction through phase-delay correlations, one should be able to identify and suppress internal noise sources from the audio stream using a similar process. We will construct a simple audio isolation test rig, able to accept and rotate a head relative to a sound source. As with the vision test rig, multiple configurations can be tested in a reproducible manner.
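A minimal sketch of the phase-delay approach for a single microphone pair is given below (the sample rate, 10 cm spacing, and synthetic signals are illustrative assumptions); generalized cross-correlation with the phase transform (GCC-PHAT) yields the inter-microphone delay, which the array geometry converts into a bearing estimate:

    # Sketch: estimate the time-difference-of-arrival between two base microphones
    # with GCC-PHAT, then convert the delay to a bearing using the assumed spacing.
    # Signals here are synthetic stand-ins for synchronized capture buffers.
    import numpy as np

    def gcc_phat(sig, ref, fs, max_tau):
        n = sig.size + ref.size
        SIG = np.fft.rfft(sig, n=n)
        REF = np.fft.rfft(ref, n=n)
        R = SIG * np.conj(REF)
        cc = np.fft.irfft(R / (np.abs(R) + 1e-12), n=n)    # phase-transform weighting
        max_shift = int(fs * max_tau)
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
        return (np.argmax(np.abs(cc)) - max_shift) / fs    # delay in seconds

    fs = 16000                       # assumed sample rate
    spacing = 0.10                   # assumed microphone spacing (m)
    mic_a = np.random.randn(fs)      # stand-in for one second of captured audio
    mic_b = np.roll(mic_a, 3)        # second channel, delayed by 3 samples
    tau = gcc_phat(mic_b, mic_a, fs, max_tau=spacing / 343.0)
    bearing = np.degrees(np.arcsin(np.clip(tau * 343.0 / spacing, -1.0, 1.0)))
    print("estimated bearing (deg):", bearing)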
While Realbotix has pending patents on silicone-capacitive touch sensing (Pirzchalski, 2020), solving the challenge of tactile sensing will require the most research. While capacitive touch is widely used, the challenges here are unique in that any sensor must function in the presence of the magnetic actuation hardware on a non-planar skull, through a silicone skin of variable depth that is in motion and may be swapped. Fortunately, hardware probes exist for acceptance testing of capacitive touch devices, and the existing head has sufficient actuation to allow 2D resolution testing. Non-capacitive options will also be examined.
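To illustrate the kind of signal processing the tactile test fixture will help parameterize (a sketch under assumed values; the thresholds are placeholders and the sensor read-out is represented by an example data stream rather than real hardware access), near-touch and contact events can be classified against a slowly tracked per-channel baseline:

    # Sketch: classify near-touch vs. contact from raw capacitance counts using a
    # median baseline per channel.  Thresholds and raw counts are placeholders to be
    # replaced by values measured on the test fixture.
    from collections import deque
    import statistics

    NEAR_DELTA, CONTACT_DELTA = 40, 120    # counts above baseline (placeholders)

    def classify(history, raw):
        baseline = statistics.median(history) if history else raw
        delta = raw - baseline
        if delta >= CONTACT_DELTA:
            return "contact"
        if delta >= NEAR_DELTA:
            return "near"
        history.append(raw)                # track baseline drift only while idle
        return "idle"

    history = deque(maxlen=200)            # slow-moving baseline for one channel
    for raw in [500, 502, 501, 560, 650]:  # example stream of capacitance counts
        print(classify(history, raw))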
The hardware development plan is designed to address the three key hardware deficits currently faced by the electronics package in robotic heads: power, connections, and volume. The Viziboard provides the required interface to the stereo cameras. The Headboard builds on the base of prior designs to provide a next-generation system that will aggregate on-board sensors (vision, audio, tactile), interface options (I2C, CAN Bus, RS-485, etc.), and provide the numerous power supplies required to drive all servos, cameras, and sensors through a single USB-C connection. Finally, the Bodyboard in the pedestal embodies a custom carrier card for an Nvidia embedded processor (Jetson Xavier NX / Jetson Nano) to host the primary software suite.
Figure 2. (top) Proposed inner face module with vision integration; (bottom) v1 Headboard under test and planned next-generation board with i.MX8M+ processor
The system and user-facing software will be developed in parallel with the hardware development. The software will be divided into three layers: a robust system of low-level control software; an intermediate level providing a foundation of vision, audio processing, tactile sensing, motion control, and animation; and a high-level layer providing a more abstract “personality definition” interface. The second layer will provide a Jupyter Labs kernel giving users access to the first-layer API and to GPT-J language modeling services for conversational interaction. Each layer will be designed to support multiple user/developer accounts.
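To illustrate how a developer might use the intermediate layer from a notebook, a hypothetical session is sketched below (the package, host name, and method names are placeholders for the planned API, not a shipping SDK):

    # Hypothetical notebook session against the planned intermediate layer.
    # "hoas", the host name, and every method shown are placeholder names.
    import hoas

    head = hoas.connect("hoas-desk-01.local")       # placeholder address of the base unit
    faces = head.vision.detect_faces()              # layer-2 vision service
    if faces:
        head.motion.look_at(faces[0].center)        # layer-1 motion control via the layer-2 API
        reply = head.agent.respond("Hello there")   # GPT-J-backed conversational service
        head.audio.say(reply)                       # head-integrated speaker output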
Course of Work and Schedule
It is anticipated that the project will require 10 to 12 months to complete.
Month 1-2: Develop detailed design and plan of attack. Finalize Head physics package, add on-board processor and sensor aggregator.
During this time, the detailed final design of the components required to conduct testing and of the test fixtures will be completed, the measured parameters (field-of-view angle, resolution, motion tracking) will be specified, and the testing software will be defined. Cabling between the camera, Viziboard, and Headboard will be designed to simplify manufacturability, then assembled and tested. Firmware to provide video to the Nvidia AI processor through the i.MX8M+ Headboard processor over USB 3 will be developed.
Month 3-4: Complete optics testing, acoustic testing, and tactile testing.
The bulk of setup and data collection will occur during this period. The test fixtures will be completed as soon as possible, and testing will begin with the collection of baselines and ground-truth reference data. The iterative process of evaluating various optical, acoustic, and sensor design configurations begins here and will extend for several months. Each configuration will be tested and evaluated for performance and manufacturability. This includes the evaluation of different speaker/microphone array options in addition to different soundproofing/noise reduction options. The far-field audio system for the system base and its localization capabilities will be developed and tested in detail. Also, the bulk of the research on exterior lens materials/designs compatible with existing eye/iris designs to create an “art-finish” exterior occurs during this time.
Month 5-6: Complete control and interaction software suite.
Camera research continues into new manufacturing methods to build new eyeballs with embedded cameras and into the use and effect of the embedded camera’s auto-focus feature.
Research is conducted into capacitive and resistive touch sensors compatible with the existing head design (and animagnetic system elements) for multiple places on the face and lips.
During this period the focus will be on completing the core software suite and using the test fixtures to systematically evaluate the performance of the integrated system. This includes creating a robust framework for OpenCV-based tracking and servo control, as well as the object and face identification subsystems and their API. A higher-level object priority tracking scheme based on contextual relevance will be created to support dynamic attention, as sketched below. The conceptual “back-end” will be created, focused on building Large Language Model (LLM) services using GPT-J to support a “personality system”. This will involve training both local and remote versions and accessing remote services (such as OpenAI, NovelAI, etc.) with proprietary datasets. A scripting framework based on Jupyter will be created for accessing the API and scripting personalities using the available senses and the GPT-J system, based on prior work. This will involve completing the design and implementation of a Jupyter kernel to interface with the control API and providing an integrated configuration system for the head accessible from the Jupyter interface. Also, JupyterLab will be configured and tested for multi-user operation. Finally, a high-level web-based interface will be implemented that will focus on defining and managing characters, their attributes, and their interaction settings and contexts (“Albert Einstein tutoring advanced high-school students on general relativity”). This builds on prior work on character definition (Coursey, 2020).
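As a sketch of how the contextual priority scheme could rank tracked objects (the class weights, recency window, and track fields below are illustrative placeholders, not tuned values), each track is scored by contextual relevance, recency, and motion, and the highest-scoring track becomes the current attention target:

    # Sketch of a contextual attention-priority score: tracked objects are ranked by
    # context weight, recency, and motion, and the top-scoring one is selected as the
    # gaze target.  All weights and fields are illustrative placeholders.
    import time
    from dataclasses import dataclass

    CONTEXT_WEIGHTS = {"face": 3.0, "hand": 1.5, "object": 1.0}   # placeholder values

    @dataclass
    class Track:
        label: str          # detector class, e.g. "face"
        last_seen: float    # timestamp of last detection
        speed: float        # image-plane speed, px/s
        center: tuple       # (x, y) gaze target in image coordinates

    def priority(track, now):
        recency = max(0.0, 1.0 - (now - track.last_seen))   # fades over ~1 s
        return CONTEXT_WEIGHTS.get(track.label, 0.5) * recency + 0.01 * track.speed

    def attention_target(tracks):
        now = time.time()
        return max(tracks, key=lambda t: priority(t, now))

    tracks = [Track("face", time.time(), 5.0, (320, 200)),
              Track("object", time.time() - 0.5, 40.0, (100, 400))]
    print(attention_target(tracks).label)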
Month 7-8: Complete user facing software. Final hardware package defined for prototype manufacturing.
Advanced optics research is conducted, building on the earlier results. This includes research into the use of fisheye or other “non-planar” camera lenses to increase viewing angles, and into the use of infrared or other non-visible-band detectors to improve visual fidelity and object detection. This may be of interest for non-contact sensing in monitoring and medical applications. Information collected from testing and evaluation is integrated into a final manufactured prototype, including the art-finish eyes with integrated optics, the base with integrated acoustics, and the head with integrated tactile sensing. The main processor package is integrated into the base along with power and external connections, resulting in a testable prototype. This period will focus on completing the user-facing software for all three levels of operation and the creation of accompanying user documentation. The developed test fixtures will be reused to evaluate high-level performance when using the Jupyter and personality-definition interface levels.
Month 9-10: Interact with user community, complete final report and follow-on proposal. The final project report will be completed, along with the release of any open-source packages and documentation. This will include documentation on the low-level physical system, operating system, and basic packages (“Shop Manual”), intermediate-level development using the API and Jupyter-level interface (“Developers Manual”), and the high-level personality interface (“App Manual”). Initial contact with potential customers will start during the early months of the project, and during this period the initial user testing program will be finalized. Contact is planned with the community of OpenCV, robotics, and social robotics researchers and educators. Also, feedback will be collected from those in the interactive character development community. Any follow-on proposal will be completed along with future commercialization evaluations.
The Company and Team
Realbotix was founded in 2015 by CEO Matt McMullen, with the help of Dr. Kino Coursey (Chief
AI Researcher), Susan Pirzchalski (Director of Engineering/Chief Robotics Engineer), Guile Lindroth (Chief AI and Content Specialist) and Yuri Machado (lead programmer and multiplatform developer) and is a privately held corporation headquartered in Las Vegas, Nevada.
Kino Coursey (as Primary Investigator) will be responsible for developing the open API to drive the robot head as well as developing the general cognitive architecture integrating the vision, audio, and touch subsystems. 100% of his time will be spent on this project. Kino has been responsible for the existing AI processing on the Harmony system (Coursey, 2019) (Coursey,
2020). He previously worked on the AI for Hanson Robotics’ Sophia (Wikipedia Sophia, 2022) and Bina48 (Hanson, 2022) (Wikipedia BINA48, 2022), as well as Robokind’s Zeno/Milo (NSF,2012) (NSF, 2013).
Susan Pirzchalski (as Director of Engineering/Chief Robotics Engineer) will be responsible for all of the hardware-related tasks including design of the camera subsystem (McMullen,2019), audio subsystem, and touch sensors (Pirzchalski,2020) being designed for this proposal. 100% of her time will be spent on this project. She is the systems architect for the Harmony system as well as the designer of the prototype mechanics of the Harmony head. She has designed the Headboard, Viziboard, and internal servo controller boards for the next-generation system as well as a number of CAN bus and Bluetooth-based sensor boards to be embedded in an external android body.
Matt McMullen is the CEO of Realbotix and will lend his extensive experience in sculpture and mold making towards the development of the art-finish eyeball and the embedded sensing for the face with an emphasis towards manufacturability. Matt also has experience with studio level acoustics and audio.
Realbotix has 11 employees working in California, Nevada, Texas, and Curitiba, Brazil. Our goal is to produce low-cost, easily customizable, real and virtual interactive humanoid simulations and to bring affordable home robots to consumers by reducing costs while increasing sophistication and quality. We aim to bring this technology to the market as an open-source system and a platform for developers. We have been involved in humanoid robotics research for the past decade, and the company leverages experience from a variety of academic, government, and commercial backgrounds. We have experience with humanoids, real-time 3D graphics, cloud computing, cognitive architectures, natural language processing, speech synthesis, facial recognition, data mining, web development, and mobile development. The Realbotix team has a proven track record of bringing products to market, including the Harmony AI App for RealDoll, which controls the only consumer-purchasable android head and delivers a full-body interactive AI simulation that can engage in both virtual and real-time social interactions.
Over 100 Harmony robotic heads have been delivered to customers over the last two years. We believe that sales of the Harmony heads have been hampered by the limitation of being tightly tethered to the phone app for all system processing. Adding a self-contained vision system and audio identification/localization capability will allow the system to create the illusion of being more “alive” and “attentive”. With a more open software architecture, the system will allow users with different use cases to create entirely different behaviors and personalities for the system.
Broader Impacts
This project has the potential to have a wide range of far-reaching consequences for both education and industry. It will help to democratize the use of social humanoid robots, making them more accessible to researchers from all disciplines, and to create opportunities for collaboration between different sectors of society. The project also has the potential to improve public scientific literacy, train the next generation of engineers and scientists, and expand participation in STEM fields by women and individuals from underrepresented groups. The broad impacts of this project highlight the importance of its objectives.
First, the ease of use and low cost of the system will make it more accessible to researchers, who can then use it to explore the potential of human-humanoid interaction in a wide range of domains. The system is designed to be easy to use and to require no specialized skills or knowledge to operate. The vast resource of online AI and research materials (GitHub, YouTube, etc.) can be directly interacted with in an engaging way. This will make it accessible to a wide range of users, from academic researchers to entertainment industry professionals. Second, the wide range of applications for which the system can be used will create opportunities for collaboration between researchers in different disciplines, and between industry and academia. This will help to train the next generation of engineers and scientists and to improve public engagement with science and technology. Our plan is to engage researchers interested in alpha and beta testing the system in an academic setting and to provide documentation sufficient to integrate the system into a graduate and undergraduate setting.
Finally, the system will also help to expand participation in STEM fields by women and individuals from underrepresented groups not just by literally "changing the face of robotics", but also by providing more accessible and capable tools. The project will enable a library of faces that can be used to represent a wide range of human features, contemporary and historic, which will help to diversify the field of robotics and encourage underrepresented groups to participate in STEM.
Figure 3. V1 type heads on a desktop stand
(Zhang, 2021) provides an overview of the current state of each of these areas from a general AI perspective. All of these anticipated outcomes of the project will have a positive impact on the economy and society as a whole, both in America and globally.
A list that enumerated the broad impacts of the project was given to OpenAI’s GPT-3 (Brown, 2020) (OpenAI, 2022) using their beta API, and it was asked to generate a response to the prompt given in bold below:
A simply eloquent speech that the holographic version of John F Kennedy gave explaining the broader Impacts of the proposed project:
My fellow Americans,
It is my privilege to speak with you today about a project that is dear to my heart, and that I believe will have a profound impact on the future of our nation and our world. This project is a collaboration between researchers in a variety of disciplines – art, psychology, computation, engineering, materials science – and it has the potential to change the way we think about one of the most fundamental aspects of human existence: communication. The goal of this project is to create a desktop social robot that is accessible to anyone, regardless of skill level or experience. This robot will be easy to use, and will come with a variety of faces and personalities that can be customized to meet the needs of each individual user.
But most importantly, this robot will be human-like in its appearance and behavior, and will be able to interact with humans in a natural and intuitive way.
I believe that this project will have a transformative effect on both the research community and the general public. It will provide researchers with a new tool for studying human-humanoid interaction, and it will allow civilians to engage with robotics in a more personal and engaging way.
Thank you for your time, and I hope that you will support this project.
Direct, robust interaction through the head, on the desktop, with such models and other AI systems will be possible at the completion of the project.