ChatGPT Technology Development and Advantage Analysis
ChatGPT is an artificial intelligence chatbot launched in November 2022 by OpenAI, an artificial intelligence company headquartered in San Francisco that was founded as a not-for-profit in 2015 and whose founders include Elon Musk and Samuel H. Altman. The company's mission is to solve the scientific and technological problems facing humanity and thereby lay a foundation for future artificial intelligence research. It brings together leading researchers and scholars from around the world, who develop artificial intelligence technologies through machine learning and academic research in order to create value for the global community and help people understand their natural environment more fully. In 2018, Elon Musk stepped down from the board of directors because of a potential conflict of interest between Tesla's self-driving technology development and OpenAI, and Altman has since been responsible for the company's operations. Because it could not afford the high cost of training models over the long term, OpenAI shifted to a capped-profit model in 2019. Soon after the reorganization, it received a $1 billion investment from Microsoft, which obtained priority rights to commercialize some of OpenAI's AI technologies, and the two companies began cooperating on artificial intelligence technologies for the Azure cloud platform. After ChatGPT became popular, Microsoft invested billions of dollars more in 2023 and integrated ChatGPT's language model into its Bing search engine and Edge browser in order to capture the huge search market opportunity.
This move made Google feel threatened, and its founders Sergey Brin and Larry Page have even returned to oversee the company's research and development of artificial intelligence technology. Currently, OpenAI's main AI technologies include:
- Machine learning: automatically learning and applying new knowledge to improve AI performance.
- Deep learning: enabling deeper and more diverse applications that more effectively understand and simulate the behavior of artificial intelligence.
- Natural language processing: deeper understanding and simulation of human language behavior.
- Autonomous behavior: understanding, modeling, and simulating human behavior.
Table 2. Comparison of key parameters of the three GPT models

Source: OpenAI
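Because a trained model is not very controllable, the distribution of the training data fed into the generative model is the most important factor affecting the quality of the generated content. Developers sometimes want the model to be shaped not only by its training data but also by human control, so that the generated content is useful, truthful, and harmless. OpenAI therefore used reinforcement learning from human feedback (RLHF) to improve the GPT-3 model, and the result is called InstructGPT. The method fine-tunes the model on prompts that users submitted through the application programming interface (API): labelers provide demonstrations of the desired behavior and rank the model's outputs. InstructGPT follows human instructions better and produces far less harmful content. Although it has only 1.3 billion parameters, far fewer than the GPT-3 model, researchers who measured its capability with natural language processing performance evaluations found the two to be roughly comparable.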
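To make the role of the labelers' rankings concrete, here is a minimal sketch of one common way to turn a ranking into a training signal: a pairwise logistic loss that is small when a scoring model gives higher scores to outputs that labelers ranked higher. The text above does not specify the loss OpenAI used, so the loss form, the function names `softplus` and `pairwise_ranking_loss`, and the example scores are all illustrative assumptions.

```python
import math
from itertools import combinations


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_ranking_loss(scores_best_first):
    """Average -log(sigmoid(s_better - s_worse)) over every pair in a ranking.

    `scores_best_first` holds hypothetical model scores for several outputs
    to the same prompt, ordered from the labeler's most preferred output
    to the least preferred one.
    """
    # combinations() keeps list order, so the first item of each pair
    # is always the output the labeler ranked higher.
    pairs = list(combinations(scores_best_first, 2))
    if not pairs:
        raise ValueError("need at least two ranked outputs")
    total = 0.0
    for s_better, s_worse in pairs:
        # -log(sigmoid(d)) equals softplus(-d)
        total += softplus(-(s_better - s_worse))
    return total / len(pairs)


# Scores that agree with the labeler's ranking give a lower loss
# than scores that contradict it.
agreeing = pairwise_ranking_loss([2.0, 1.0, 0.0])
contradicting = pairwise_ranking_loss([0.0, 1.0, 2.0])
```

Minimizing a loss of this kind pushes the scoring model toward the labelers' preferences, which is what lets a learned score stand in for human judgment later in training.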
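ChatGPT is built on GPT-3.5, a model derived from GPT-3, and is likewise trained with reinforcement learning from human feedback. The training procedure has three steps, as shown in Figure 2. First, GPT-3 undergoes supervised fine-tuning (SFT) on a collected dataset. Second, human-labeled comparison data is collected to train a reward model (RM). Third, the reward model serves as the optimization target for reinforcement learning, and the model is fine-tuned with the proximal policy optimization (PPO) algorithm. ChatGPT's data collection differs slightly from InstructGPT's, and it adds reinforcement learning with proximal policy optimization, which can be thought of as adding a human feedback system on top of human-like reasoning. As a result, its writing reads as more realistic, its coding ability is stronger, and the model's harmlessness is slightly improved.
ChatGPT's technical advantage is its use of a self-attention mechanism, which lets it better understand context and take earlier parts of the conversation into account when generating text. Besides producing high-quality text quickly, it can be used in many different domains without any additional training and can perform a variety of dialogue tasks such as sentiment analysis, relation inference, and context modeling.
The technology still has limitations to overcome, including: (1) The timeliness of ChatGPT's output depends on how often OpenAI updates the model's data and on where that data comes from, so the output may not reflect current conditions. (2) ChatGPT can only generate text from existing data, so if that data lacks information on a particular domain, the generated text will inevitably fall short of professional standards. (3) When training ChatGPT, OpenAI typically uses large amounts of text that has been filtered, manually or automatically, to exclude inappropriate content. As the model is used publicly, however, some inappropriate data may end up being used in generating results, which reduces accuracy.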
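The self-attention mechanism mentioned above can be illustrated with a small sketch of scaled dot-product attention with a causal mask, where each position only looks at itself and earlier positions. This is a simplified illustration rather than ChatGPT's actual implementation: the learned query, key, and value projections and the multiple heads of a real model are omitted, and the function names and toy vectors are assumptions made for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention; position i attends to positions <= i."""
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Score the current position against itself and all earlier ones.
        scores = [
            sum(qc * kc for qc, kc in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # The output is the weighted average of the visible value vectors.
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ])
    return outputs


# Toy input: three token vectors, used as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_self_attention(tokens, tokens, tokens)
```

Because every output vector is a weighted mix of the vectors that came before it, text generated at a later position can draw on earlier context, which is the property the paragraph above credits for handling prior dialogue.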
Figure 2. Training procedure of the GPT-3.5 and InstructGPT models

Source: OpenAI
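The third step of the training procedure, fine-tuning with proximal policy optimization, centers on a clipped objective that keeps the updated model from drifting too far from the model that generated the samples. The sketch below shows the standard PPO clipped surrogate objective in isolation. The function name, the clipping value of 0.2, and the idea that the advantage values would be derived from reward model scores are assumptions for illustration, and any additional terms OpenAI may use are left out.

```python
import math


def ppo_clipped_objective(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    new_logprobs / old_logprobs: log-probabilities of the sampled tokens
    under the updated policy and under the policy that produced the samples.
    advantages: how much better than expected each sample was (hypothetical
    values here; in RLHF they would come from reward model scores).
    """
    if not (len(new_logprobs) == len(old_logprobs) == len(advantages)):
        raise ValueError("inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio new / old
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes any incentive to move the ratio
        # outside the clipping range in the direction that raises the objective.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# With an unchanged policy the ratio is 1 and the objective is the mean advantage.
unchanged = ppo_clipped_objective([-1.0], [-1.0], [2.0])
# A large increase in probability for a good sample is capped at 1 + clip_eps.
capped = ppo_clipped_objective([0.0], [-1.0], [1.0])
```

The clipping is what makes the method "proximal": each update can only claim a bounded gain from changing the probability of a sampled output.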






