Data-driven engine — Liu Yilun, Huawei 2012 Labs, Text Machine Translation Lab
Team introduction: Huawei Text Machine Translation Laboratory

From AIOps to LLMOps: the strong generalization and language-understanding abilities of large models are driving the evolution of AIOps. Prompts act as a bridge between the human cognitive world and the digital world of large models: they make the model's intent and reasoning path clearer and improve the efficiency of human-model interaction. Effective interaction strategies generate content that matches human intent and needs, and training models on question-answer pairs helps them understand those intents and needs.

Pain point 1: traditional intelligent O&M algorithms depend on task-specific data, and expert annotation is time-consuming and labor-intensive.
Pain point 2: traditional O&M systems have poor interpretability and weak interactivity.
Pain point 3: unstable quality of prompt training data degrades model performance.
Pain point 4: insufficient coverage of prompt training data harms the comprehensiveness of AI capabilities.
Related work: LogPrompt (accepted at ICSE 2024 / ICPC 2024) and an LLMOps data flywheel built with large models (accepted at ICDE 2024).

Motivation: Existing approaches rely on massive training data and lack interpretability
- Logs are semi-structured text, and new features keep introducing unseen log events.
- Existing methods perform well on log analysis tasks only when trained on most of the available logs (tested on the remaining 20% or 10% of logs); performance drastically declines when the training data is limited to 10% or lower.
- The reliance on massive training data makes existing methods ineffective and inflexible in online scenarios.
- Existing methods only offer prediction values without rationales, limiting practitioners' understanding of incidents.
- Practitioners often need to spend extra effort to interpret the results and act on them (identify root causes, compose reports, etc.).

Motivation: Large language models have the potential to address these challenges, and the prompt strategy matters
- Large language models (LLMs) have a powerful generalization ability to unseen user instructions, so they may also be able to handle unseen logs in the online situation of log analysis.
- In our preliminary experiments, ChatGPT with a simple prompt achieved an F1-score of only 0.189 in anomaly detection. However, our best prompt strategy outperformed the simple prompt by 0.195.
- Since log analysis is a domain-specific task, applying a simple prompt to LLMs can result in poor performance.
- Unlike existing deep learning methods, LLMs can also produce explanations, reports, etc.; log interpretation is essentially a writing task.
- Many prompt philosophies have been proposed for NLP tasks, such as CoT.
- The primary objective of LogPrompt is to improve the interpretability of log analysis in the online scenario through a proper strategy of prompting LLMs.

Approach: The chain-of-thought (CoT) prompt
The concept of chain of thought (CoT), a series of intermediate reasoning steps, is introduced by Wei et al. [1]. In the original work, the CoT prompt places an example with intermediate reasoning steps before an input math problem, so that the model is encouraged to follow the same thinking steps. The CoT prompt emulates the human thought process by requiring the model to include thinking steps when addressing complex problems, and can enhance the performance of LLMs in challenging tasks such as solving mathematical problems.
Advantages of CoT prompting:
- Break down unseen problems into manageable steps.
- Enhance the interpretability and transparency of LLM outputs.
- Unleash the abilities learned in the pre-training phase.

[1] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., "Chain-of-thought prompting elicits reasoning in large language models," Advances in Neural Information Processing Systems, vol. 35, pp. 24824-24837, 2022.

Approach: Adapting the CoT prompt to the field of log analysis
In manual log analysis, practitioners also engage in a series of reasoning steps to reach a conclusion, for example when the boundary between a normal log and an abnormal log is unclear. To emulate this thinking process:
- Implicit CoT: humans mostly have reasons before conclusions, so the model is required to generate reasons justifying its decisions.
- Explicit CoT: we further explicitly define intermediate steps to regulate the thinking process, e.g., constraining the definition of an anomaly to be only alerts explicitly expressed in the log.
[Figure: comparison of the Standard Prompt (performing analysis based on the task description and input logs) and LogPrompt (CoT).]

Approach: Other prompt strategies
- In-context Prompt: this approach uses a few samples of labeled logs to set the context for the task; the LLM then predicts on new logs, using the labeled samples as context.
- Format Control: we employ two functions, fx([X]) and fz([Z]), to constrain the input and the answer range, such as "a binary choice between abnormal and normal" or "a parsed log template".
- Self-prompt: this strategy involves the LLM generating its own prompts. A meta-prompt describing the task asks the LLM to generate candidate prompt prefixes, with the most effective prompt chosen based on its actual performance.
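Below is a minimal sketch of how these prompt strategies could be assembled in code. It is illustrative only: the helper names (format_control, cot_prompt, in_context_prompt) and every prompt string are placeholders of mine, not the verbatim prompts used in LogPrompt.

```python
# Minimal, illustrative sketch of the prompt strategies above (CoT, in-context
# examples, format control). All strings and helper names are placeholders,
# not the prompts used in LogPrompt itself.

from typing import List, Tuple


def format_control(input_slot: str, answer_range: str) -> str:
    """Plays the role of the fx([X]) / fz([Z]) functions: wrap the raw input
    and constrain the expected answer range."""
    return f"Input logs:\n{input_slot}\nAnswer format: {answer_range}"


def cot_prompt(task_description: str, logs: List[str]) -> str:
    """Implicit + explicit CoT for anomaly detection (illustrative wording)."""
    steps = (
        "Think step by step:\n"
        # Explicit CoT: constrain the definition of an anomaly.
        "1. Mark a log as abnormal only if it explicitly expresses an alert.\n"
        # Implicit CoT: ask for reasons before the conclusion.
        "2. State the reason for your decision before the final answer.\n"
    )
    answer_range = "abnormal or normal, followed by a short reason"
    return task_description + "\n" + steps + format_control("\n".join(logs), answer_range)


def in_context_prompt(task_description: str,
                      examples: List[Tuple[str, str]],
                      logs: List[str]) -> str:
    """Prefix a few labeled logs to set the context before the new inputs."""
    demo = "\n".join(f"Log: {log}\nLabel: {label}" for log, label in examples)
    answer_range = "a parsed log template with variables replaced by <*>"
    return task_description + "\n" + demo + "\n" + format_control("\n".join(logs), answer_range)


if __name__ == "__main__":
    print(cot_prompt("Classify each of the following logs as normal or abnormal.",
                     ["2017-07-02 18:32:01 ERROR: disk quota exceeded on /dev/sda1"]))
```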
Experiment setup:
- The effectiveness of LogPrompt is evaluated mainly on the LogHub datasets, which contain real-world logs from nine different domains, including supercomputers, distributed systems, operating systems, mobile systems, and server applications.
- Two of the datasets (BGL and Spirit) were annotated by domain experts to identify anomalous events for the purpose of anomaly detection.
- To evaluate log parsing performance, eight of the datasets have log templates manually extracted from a subset of 2,000 log messages in each domain.
- All log data are timestamped, so the train/test sets are split chronologically to simulate online scenarios.
- In our primary experiments, the underlying LLM is accessed via APIs provided by external services.
- The initial temperature coefficient is set to 0.5, balancing the gain in reasoning ability from diverse token exploration against detrimental randomness.
- If the response format is invalid, the query is resubmitted with the temperature coefficient increased by 0.4 until the format is correct. The format failure rate is below 1%, consistent with existing literature.

Experiment: LogPrompt has a strong ability to handle scarcity in training data
- Task of log parsing:
  - For each dataset, most baseline methods are trained on the first 10% of logs and evaluated on the remaining 90%, while LogPrompt is directly tested on the remaining 90% without any in-domain training data.
  - We adopt the F1-score as the metric. To calculate it, we tokenize the predicted log template into a list of tokens, then treat the tokens as the results of a classification task over {template, variable} (see the sketch after this section).
  - LogPrompt achieved the best F1-score on six of the eight datasets and outperformed existing methods that require resources for in-domain training.
- Task of anomaly detection:
  - For the baselines, the first 4,000 logs in each dataset are used for training; both LogPrompt and the trained baselines are then tested on the remaining logs.
  - We report the session-level F1-score of anomalies. A session is formed by fixed-window grouping with a length of 100 log templates.
  - Despite the existing methods being trained on thousands of logs, LogPrompt still achieved strong performance on both datasets without using any in-domain training data, with an average improvement of 55.9% in F1-score.
- This advantage makes LogPrompt a suitable choice for log analysis in online scenarios.
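The sketch below spells out, under my own reading of the metrics described above, how the token-level F1 for log parsing and the fixed-window session grouping for anomaly detection could be computed. Details such as positional token alignment and the "<*>" wildcard symbol are assumptions, not facts stated in the slides.

```python
# Sketch of the two evaluation metrics described above: token-level F1 for log
# parsing (each template token classified as {template, variable}) and
# fixed-window session grouping for session-level anomaly detection.
# Alignment and the wildcard symbol are simplifying assumptions of this sketch.

from typing import List


def token_f1(predicted: List[str], ground_truth: List[str], wildcard: str = "<*>") -> float:
    """Token-level F1 with 'variable' as the positive class.

    Assumes the predicted and ground-truth templates tokenize to the same
    length (an assumption of this sketch, not a claim about the paper)."""
    tp = fp = fn = 0
    for p, g in zip(predicted, ground_truth):
        pred_var, true_var = (p == wildcard), (g == wildcard)
        if pred_var and true_var:
            tp += 1
        elif pred_var and not true_var:
            fp += 1
        elif not pred_var and true_var:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


def sessions(log_templates: List[str], window: int = 100) -> List[List[str]]:
    """Fixed-window grouping: every 100 consecutive log templates form a session."""
    return [log_templates[i:i + window] for i in range(0, len(log_templates), window)]


if __name__ == "__main__":
    pred = "Connection from <*> closed after <*> seconds".split()
    gold = "Connection from <*> closed after <*> seconds".split()
    print(token_f1(pred, gold))  # 1.0 when the prediction matches exactly
```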
Experiment: LogPrompt yields helpful and comprehensible content for practitioners during real-world log analysis
- A novel evaluating task of log interpretation:
  - A total of 200 logs were randomly sampled for the human evaluation, accompanied by LogPrompt's actual outputs, with 100 logs related to log parsing and 100 related to anomaly detection, evenly distributed across domains.
  - Incorrectly predicted logs (FPs and FNs) were not included in this evaluation. An equal number of normal and abnormal samples were included for anomaly detection, and each selected log for log parsing was required to contain at least one variable. The evaluators scored the outputs according to the criteria independently.
  - We reported two metrics. For both tasks, the average scores in terms of usefulness and readability consistently exceeded four, and the average HIP was consistently above 80%, indicating content that experienced log analysts considered helpful and readable overall.

Experiment: More analysis of LogPrompt's interpretability
- Feedback from practitioners:
  - "I appreciate the ability of LogPrompt in supporting the interpretation of logs from various domains. As our system continuously incorporates third-party services, sometimes I have to read through the manuals to decipher logs from unfamiliar domains. The explanations generated by LogPrompt can provide a swift grasp of the logs before I find the official definitions."
  - "LogPrompt can definitely help in the composition of reports shortly after system crashes, where I often need to brief non-technical colleagues in meetings."
  - "In the realm of software O&M, false alarms are an inescapable reality, with false positives imposing substantial time costs on engineers and false negatives causing severe ramifications. Accompanying automatic analysis outcomes with explanations enables engineers to more promptly ascertain the credibility of the purported anomaly, thereby reducing the time spent on subsequent actions."
- Bad case analysis:
  - A major factor in bad cases is the LLM's lack of domain knowledge, which leads to overly general interpretations of some domain-specific terms; for example, it may describe specific parameters only in generic terms. Another cause is the lack of semantic content in short logs, which can be attributed to their brevity or their richness of non-NLP patterns (i.e., digits, codes and addresses).

Experiment: Ablation study on the three prompt strategies
Compared to prompt 5, prompt 2 provides more formal and accurate words (such as "standardized" and "convert") and clearly outlines the intermediate steps for the task (identifying and replacing variables, then converting to a template). Interestingly, utilizing only the implicit CoT (requiring generated reasons) can still improve model performance, likely because, with more NLP-style explanations, the distribution of the generated answers is closer to that of the models' pre-training phase. An overly long context prefixed to the prompt may cause LLMs to pay less attention to the new input logs, thereby deteriorating task performance, which is why the performance peaks at a moderate number of in-context examples.

Future work: Domain-adapting smaller-scale LLMs for compatibility with advanced prompt strategies
- Applying LogPrompt to a smaller-scale LLM: Vicuna-13B
  - The deployment of large, proprietary, and API-dependent LLMs in local environments can be a challenging task.
  - The reliability of services may be compromised if the API services of LLMs become unavailable.
  - Therefore, for industrial usage, it is crucial for LogPrompt to be compatible with alternative open-source, privacy-protected, and smaller-scale LLMs.
- Online log parsing with smaller-scale LLMs using LogPrompt
  - Although Vicuna has only 13B parameters, when equipped with LogPrompt it achieves log parsing performance comparable to the 175B GPT model on the HDFS, Linux and Proxifier datasets.
  - Additionally, when the prompt strategy transitions from a simple prompt to LogPrompt, Vicuna exhibits significant performance improvements, with an average increase of 380.7% in F1-score.
  - As Vicuna is open-source and requires fewer resources for training and deployment, the success of LogPrompt on Vicuna holds promising implications for building in-domain industrial applications.
  - The performance of smaller-scale LLMs like Vicuna still has room for improvement, since the base model ...
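To make this future-work direction concrete, here is a hedged sketch of pointing the same prompting pipeline at a locally hosted open-source model (e.g., Vicuna-13B behind an OpenAI-compatible endpoint such as one served by vLLM), reusing the resubmit-on-invalid-format logic from the experiment setup. The endpoint URL, model name, the format check, and reading "increased temperature coefficient of 0.4" as an additive bump are all assumptions of mine.

```python
# Sketch of querying a locally hosted, open-source model through an
# OpenAI-compatible chat completions endpoint, with the resubmission logic
# described in the experiment setup (initial temperature 0.5, raised on
# format failure). URL, model name and the validity check are placeholders.

import requests

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # hypothetical local server
MODEL = "vicuna-13b"                                     # hypothetical model name


def looks_valid(answer: str) -> bool:
    """Placeholder format check; LogPrompt's actual criterion is stricter."""
    return "normal" in answer.lower()


def query(prompt: str, temperature: float = 0.5, max_retries: int = 3) -> str:
    answer = ""
    for _ in range(max_retries):
        resp = requests.post(ENDPOINT, json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": temperature,
        }, timeout=120)
        answer = resp.json()["choices"][0]["message"]["content"]
        if looks_valid(answer):
            return answer
        temperature += 0.4  # resubmit with a higher temperature on format failure
    return answer


if __name__ == "__main__":
    print(query("Is the following log normal or abnormal? Give a reason.\n"
                "kernel: Out of memory: Kill process 1234 (java)"))
```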