April_ChatGPT Discussion|Discussion on Business Opportunities and Business Models Driven by ChatGPT
Currently, OpenAI's business model is to charge users service fees for access to its models through applications, including the ChatGPT model for natural language processing, the DALL-E model for creating and editing original images, and the Whisper model for automatic speech recognition. Fees vary by AI model and by usage: generated images are priced by resolution and generated text by number of characters. This business model is built on Artificial Intelligence Generated Content (AIGC), in which machine-learning models create brand-new content such as text, graphics, audio, music, video, code, designs, marketing ads, and 3D models. The application scenarios span operations, customer experience, and product and service innovation, and mostly involve knowledge-based or creative work.
Internet content production has gone through three phases: professionally generated content (PGC), user-generated content (UGC), and artificial intelligence-generated content (AIGC). Most Internet content in the Web 1.0 era was high-quality text and video produced by professionals. In the Web 2.0 era, users began to upload their own content freely. In the Web 3.0 era, content is expected to be produced increasingly by artificial intelligence; this should also support the development of the metaverse, which requires a large amount of digital-native content that can only be created at scale with the help of AI. The development of AI-generated content technology can be broadly divided into three phases, as described below.
- Early infancy, 1950s to 1990s: limited by the state of the art, the technology was used only for experimental purposes. In 1957, Lejaren Hiller and Leonard Isaacson completed the first computer-generated musical composition, the "Illiac Suite". In 1966, Joseph Weizenbaum and Kenneth Colby developed the world's first chatbot, "Eliza", which analyzed text input and reassembled specific words and phrases into entirely new combinations. In the 1980s, IBM developed the voice-controlled typewriter "Tangora", capable of processing 20,000 words.
- The 1990s to 2010s: the technology moved from experimental use to commercialization. Although deep learning algorithms, graphics processing units (GPUs), tensor processors, and training data scale all made significant advances, results in practice still left room for improvement because of the limits of algorithm development. In 2007, Ross Goodwin, an artificial intelligence researcher at New York University, completed "1 The Road", the world's first novel created by artificial intelligence. In 2012, Microsoft demonstrated a fully automatic simultaneous interpretation system that used deep neural network (DNN) technology to generate Chinese speech from an English speaker through speech recognition, speech translation, and speech synthesis.
- The rapid growth period, 2010s to now: thanks to significant progress in deep learning, the technology achieved a breakthrough, and since 2022 related algorithms have proliferated and been commercialized, mainly in AI image generation, for example OpenAI's DALL-E, Meta's Make-A-Scene, and Google's Imagen and Parti models.
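The business models currently feasible for AI-generated content technology include: (1) Text generation, such as emails and advertising copy; most AI text-generation projects today use OpenAI's GPT-3 model. (2) Image generation, which mainly combines the multimodal neural language model CLIP with a denoising diffusion image model, so that a picture can be generated automatically from just a few descriptive keywords. (3) Development of the underlying foundation models, a field currently led by OpenAI and Stability AI. The next likely hot spot is AI-generated video and animation, for which major players such as Meta and Google are developing solutions.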
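Once generative AI technology matures, the main beneficiaries are expected to be AI chip makers on the hardware side, AI algorithm developers on the software side, and AI-generated content service providers on the application side. Because the technology relies mainly on large-scale data computation, computing power matters more than algorithms.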
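ChatGPT is an artificial intelligence model built on natural language processing technology. Several domestic startups have already developed related application services, including 竹間智能科技, 犀動智能科技, 萬達人工智慧科技, 網資科技, 韜睿軟體, and the ASUS AI R&D Center (華碩AI研發中心); among them, 犀動智能科技 has obtained a technology license from OpenAI. That company's R&D falls into three categories: cloud semantic analysis, Internet of Things (IoT) architecture, and data value-added analytics. Cloud semantic analysis provides a friendly, human-like conversational experience within a specialized domain. It works in three steps. First, automatic speech recognition applies ambient noise reduction, echo cancellation, and acoustic feature extraction to achieve multilingual recognition and interaction. Next, natural language understanding converts the meaning of the text into content the machine can process. Finally, speech synthesis talks to the user with natural, human-sounding pronunciation at a fluent pace, supports multiple languages and dialects, and lets application services create personalized dialogue. Its technical architecture comprises multi-intent understanding, domain-specific knowledge graphs, and a dialogue management system.
[Note: Multi-intent understanding uses large volumes of text semantics and deep neural network technology to identify and follow up on multiple action items or intents in a conversation, so that the bot can carry out tasks in a sequential, logical order and meet the business needs implicit in the dialogue. Domain-specific knowledge graphs give the voice system highly flexible reasoning ability, allowing it to go deep into specialized domains and widen the range of conversation. The dialogue management system provides a complete dialogue flow and logical structure for natural-language intents; when the meaning is unclear, it can quickly ask a spoken follow-up question to pin down the user's intent and reduce errors in interpreting spoken requests.]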
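The three-step flow described above (speech recognition, language understanding, speech synthesis), together with multi-intent understanding and a clarifying follow-up question, can be sketched roughly as follows. This is only an illustrative sketch, not the company's implementation: the function names, the keyword table, and the hotel-style intents are all assumptions, and simple keyword matching stands in for the deep neural network models the article mentions.

```python
# Hypothetical keyword table; a production system would use a trained NLU model.
INTENT_KEYWORDS = {
    "request_towels": ("towel",),
    "set_wake_up_call": ("wake-up", "wake up"),
    "order_room_service": ("room service", "breakfast"),
}


def recognize_speech(audio: str) -> str:
    """Stand-in for ASR: the 'audio' is already a transcript, so only normalize it."""
    return audio.strip().lower()


def understand(text: str) -> list[str]:
    """Stand-in for multi-intent NLU: every matched intent, in order of first mention."""
    hits = []
    for intent, keywords in INTENT_KEYWORDS.items():
        positions = [text.find(k) for k in keywords if k in text]
        if positions:
            hits.append((min(positions), intent))
    return [intent for _, intent in sorted(hits)]


def manage_dialogue(intents: list[str]) -> str:
    """Ask a clarifying question when no intent is found; otherwise confirm tasks in order."""
    if not intents:
        return "Sorry, could you tell me what you need?"
    return "OK: " + ", then ".join(intents)


def synthesize(reply: str) -> str:
    """Stand-in for speech synthesis: tag the text instead of producing audio."""
    return f"[spoken] {reply}"


def handle_utterance(audio: str) -> str:
    """Run one utterance through the ASR -> NLU -> dialogue -> TTS stages."""
    text = recognize_speech(audio)
    return synthesize(manage_dialogue(understand(text)))


print(handle_utterance("Please bring towels and set a wake-up call for 7"))
```

Keeping each stage behind its own function means a real recognizer, intent model, or synthesizer could replace a stub without changing the rest of the flow.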
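The IoT architecture is based mainly on an event-driven architecture, combined with wireless networking, to control data flows automatically and thereby make the system scalable and flexible. Its technical architecture includes HydraLink, Elfin Control Agent, and the Mobile Intelligent Routing Framework (MIRF). HydraLink serves as the interface between Modbus/TCP proxy hosts; it supports industrial communication protocol interfaces and offers efficient, stable transmission for control units. Elfin Control Agent is an agent model for device control that converts semantic content into signals carrying the information needed to execute the connected user's request. The Mobile Intelligent Routing Framework simplifies the coding of routing applications for wireless devices, speeding up application development and shortening custom development time.
Data value-added analytics applies various data analysis models to identify key indicators and build user profiles, which then support marketing, experience optimization, equipment operations, and management monitoring. Its technical architecture includes a sentiment analysis SDK and a recommendation system. The sentiment analysis SDK analyzes the emotion expressed in what customers say to gauge their overall impression of a product or company, so the business can adjust its direction; it also captures customers' impressions of the product during service, helping the business understand how its products are rated. The recommendation system collects and analyzes the information in what customers say to build customer profiles and identify latent needs, and then offers customized services or products.
At present, 犀動智能科技 earns most of its revenue from solutions for the hospitality industry. Its smart voice butler service, launched in 2019, uses smart speakers to help hotel operators handle all kinds of room-service requests, and the company claims it can help management uncover business opportunities in large volumes of data. Besides providing the voice butler service and saving operators service labor, it also offers a cloud-based data back end that helps operators spot potential problems and set business strategy. The service has been adopted by a number of domestic four- and five-star hotels, and the company is actively expanding into Asian markets such as Japan, Malaysia, Singapore, and Thailand.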
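To make the event-driven IoT architecture described above more concrete, here is a minimal publish/subscribe sketch. It shows only the general pattern: components react to events independently, so new handlers can be added without touching existing ones. It does not model HydraLink, Elfin Control Agent, or MIRF; the `EventBus` class, the `device.command` event name, and the payload fields are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process publish/subscribe bus (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def subscribe(self, event_type: str, handler) -> None:
        """Register a handler to be called for every event of this type."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> list:
        """Deliver the payload to every subscriber and collect their results."""
        return [handler(payload) for handler in self._handlers.get(event_type, [])]


def control_agent(payload: dict) -> dict:
    """Hypothetical device-control agent: turn a parsed intent into a device signal."""
    signal = "ON" if payload["action"] == "turn_on" else "OFF"
    return {"device": payload["device"], "signal": signal}


audit_log = []


def audit(payload: dict) -> None:
    """Second, independent subscriber: record what was requested."""
    audit_log.append(f"{payload['action']} -> {payload['device']}")


bus = EventBus()
bus.subscribe("device.command", control_agent)
bus.subscribe("device.command", audit)

results = bus.publish("device.command", {"device": "lamp", "action": "turn_on"})
print(results[0])
print(audit_log)
```

A real deployment would presumably run the bus over a network with a message broker rather than in one process; the in-memory version is used here only to keep the example self-contained.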
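There are other business models and applications as well. One example is the Vocol AI voice collaboration platform, launched in 2023 and built on the GPT-3 language model. It can transcribe audio files in Chinese, English, Japanese, and other languages into text immediately and then have a machine-learning model write a summary. It can also organize the full text or the summary by speaker, section, or time, and it offers real-time sharing for multi-user collaboration. The platform targets enterprise users and individual professionals. It can already help companies turn meetings into summaries or complete minutes, and it can be used for online sales assessment, estimating the likelihood of closing a deal from conversations with customers. As GPT models continue to develop, we look forward to more applications that overturn established habits and conventional thinking.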






