Building a Prompt Data-Driven Engine for the LLMOps Era
Liu Yilun, Huawei 2012 Labs, Text Machine Translation Lab

Team introduction: Huawei Text Machine Translation Lab

From AIOps to LLMOps: the strong generalization and language-understanding abilities of large models are driving AIOps forward.
• Prompts are a bridge between the human cognitive world and the digital world of large models: they convey intent to the model along a clearer reasoning path and improve the efficiency of human-model interaction.
• Effective interaction strategies make the model generate content that matches human intent and needs; training models on question-answer pairs helps them understand that intent.
• Pain point 1: traditional intelligent O&M algorithms depend on task data, and expert annotation is time-consuming and labor-intensive.
• Pain point 2: traditional O&M systems have poor interpretability and weak interactivity.
• Pain point 3: unstable quality of prompt training data degrades model performance.
• Pain point 4: insufficient coverage of prompt training data harms the comprehensiveness of AI capabilities.
(Work addressing the first two pain points accepted at ICSE 2024 / ICPC 2024; work on building the LLMOps data flywheel with large models accepted at ICDE 2024.)

01 Motivation: existing approaches rely on massive training data and lack interpretability
• Logs are semi-structured text, and new log features keep emerging in online scenarios.
• Existing methods only perform well on log analysis tasks when trained on most of the available data (tested on the remaining 20% or 10% of logs); performance drastically declines when training data is limited to 10% or lower.
• This reliance on massive training data makes existing methods ineffective and inflexible in online scenarios.
• Existing methods only offer prediction values without rationale, so practitioners often need to spend extra effort to understand the results and act on them (identify root causes, compose reports, etc.), without a deeper understanding of the incident.

Motivation: large language models have the potential to address the challenges, and prompting matters
• Large language models (LLMs) have powerful generalization ability to unseen user instructions, and may also be able to handle unseen logs in the online situation of log analysis.
• In our preliminary experiments, ChatGPT with a simple prompt achieved an F1-score of only 0.189 in anomaly detection; our best prompt strategy outperformed the simple prompt by 0.195. Since log analysis is a domain-specific task, applying a simple prompt to LLMs can result in poor performance.
• Unlike existing deep-learning methods, LLMs can also interpret logs and support downstream actions such as composing reports; log interpretation is essentially a writing task.
• Many prompt philosophies have been proposed for NLP tasks, such as CoT. The primary objective of LogPrompt is to improve the interpretability of log analysis in the online scenario through proper strategies of prompting LLMs.

Approach: the chain-of-thought (CoT) prompts
• The concept of chain of thought (CoT), a series of intermediate reasoning steps, is introduced by Wei et al. [1]. In the original work, the CoT prompt puts an example with intermediate steps before an input math problem, so the model is encouraged to follow the thinking steps. The CoT prompt emulates the human thought process by requiring the model to include thinking steps when addressing complex problems, and can enhance the performance of LLMs in challenging tasks, such as solving mathematical problems.
• Advantages of CoT prompting: break down unseen problems into manageable steps; enhance interpretability and transparency of LLM outputs; unleash the abilities learned in the pre-training phase.

[1] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., "Chain-of-thought prompting elicits reasoning in large language models," Advances in Neural Information Processing Systems, vol. 35, pp. 24824-24837, 2022.

Approach: adapting the CoT prompt to the field of log analysis
• In manual log analysis, practitioners also engage in a series of reasoning steps to reach a verdict, e.g., when the boundary between a normal log and an abnormal log is unclear. We emulate this thinking process in two ways:
• Implicit CoT: humans mostly have reasons before conclusions, so the model is required to justify its decisions.
• Explicit CoT: we further explicitly define intermediate steps to regulate the thinking process, e.g., constraining the definition of anomaly to cover only alerts explicitly expressed in the log.
• (Figure: the standard prompt, which performs analysis based only on a task description and the input logs, compared against LogPrompt with CoT.)

Approach: other prompt strategies
• In-context prompt: this approach uses a few samples of labeled logs to set the context for the task; the LLM then predicts on new logs using that context.
• Format control: we employ two functions, fX([X]) and fZ([Z]), to establish the input format and constrain the answer range, such as "a binary choice between abnormal and normal" or "a parsed log template".
• Self-prompt: this strategy involves the LLM generating its own prompts. A meta-prompt describing the task guides the LLM to generate candidate prompt prefixes, with the most effective prompt chosen based on measured performance. (A combined sketch of these strategies appears after this list.)
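To make these strategies concrete, the following is a minimal Python sketch of assembling a LogPrompt-style prompt that combines implicit/explicit CoT, in-context examples, and format control. The function names, the wording of the instructions, and the rendering of fX([X])/fZ([Z]) are illustrative assumptions, not the paper's verbatim prompts.

# Minimal sketch of assembling a LogPrompt-style prompt. All names and the
# exact instruction wording are illustrative assumptions.

def format_control(answer_range: str) -> str:
    # Plays the role of fZ([Z]): constrain the answer range, e.g. a binary
    # choice between abnormal and normal, or a parsed log template.
    return f"Answer strictly within the following range: {answer_range}."

def build_prompt(task_description: str,
                 logs: list[str],
                 answer_range: str,
                 cot: bool = True,
                 examples: list[tuple[str, str]] | None = None) -> str:
    parts = [task_description]
    if cot:
        # Implicit CoT: require a justification before the verdict.
        parts.append("For each log, first give your reasoning, then your answer.")
        # Explicit CoT: regulate the intermediate steps.
        parts.append("Mark a log as abnormal only if it contains an explicitly "
                     "expressed alert.")
    # In-context prompt: a few labeled logs set the context for the task.
    for log, label in examples or []:
        parts.append(f"Log: {log}\nAnswer: {label}")
    parts.append(format_control(answer_range))
    parts.extend(f"Log: {log}" for log in logs)
    return "\n".join(parts)

prompt = build_prompt(
    task_description="Classify each of the following logs as normal or abnormal.",
    logs=["machine-1 kernel: CPU stall detected"],
    answer_range="a binary choice between abnormal and normal",
    examples=[("session opened for user root", "normal")],
)
print(prompt)

In practice the returned string would be submitted to the LLM; the ablation study later in this deck suggests keeping the in-context prefix short, since an overly long context dilutes the model's attention on the new input logs.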
Experiment: setup
• The effectiveness of LogPrompt is evaluated mainly on the LogHub datasets, which contain real-world logs from nine different domains, including supercomputers, distributed systems, operating systems, mobile systems, and server applications.
• Two of the datasets (BGL and Spirit) were annotated by domain experts to identify anomalous events for the purpose of anomaly detection.
• To evaluate log parsing performance, eight of the datasets have log templates manually extracted from a subset of 2,000 log messages in each domain.
• All log data were timestamped, enabling the datasets to be split into train/test sets chronologically; the chronological split simulates online scenarios.
• In our primary experiments, the underlying LLM is accessed via APIs provided by external services.
• The initial temperature coefficient is set to 0.5, maintaining a balance by increasing the model's reasoning capabilities through diverse token exploration while limiting detrimental randomness. If the response format is invalid, the query is resubmitted with the temperature increased by 0.4 until the response format is correct; the format failure rate is less than 1%, consistent with existing literature.

Experiment: LogPrompt has a strong ability to handle scarcity of training data
• Task of log parsing: for each dataset, most baseline methods are trained on the first 10% of logs and evaluated on the remaining 90%, while LogPrompt is directly tested on the remaining 90% without any in-domain training data.
• We adopt the F1-score as the metric. To calculate it, we tokenize the predicted log template into a list of tokens, then treat the tokens as results from a classification task over {template, variable}. (Both evaluation metrics are sketched in code after this section.)
• LogPrompt achieved the best F1-score on six of the eight datasets and outperformed existing methods that require resources for in-domain training.
• Task of anomaly detection: for the baselines, the first 4,000 logs in each dataset are used for training; both LogPrompt and the trained baselines are then tested on the remaining logs.
• We report the session-level F1-score of the anomaly class. A session is formed using fixed-window grouping with a length of 100 log templates.
• Despite the existing methods being trained on thousands of logs, LogPrompt still achieved strong performance on both datasets without utilizing any in-domain training data, with an average improvement of 55.9% in F1-score. This advantage makes LogPrompt a suitable choice for log analysis in online scenarios.
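The two metrics above can be stated compactly in code. Below is a hedged sketch, not the paper's evaluation script: it assumes variables in a template are marked with a "<*>" placeholder and that predicted and ground-truth templates align token by token; helper names are hypothetical.

# Sketch of the two evaluation metrics described above.
from sklearn.metrics import f1_score

def parsing_f1(pred_templates, true_templates):
    # Token-level F1 for log parsing: every token is classified as
    # 'variable' (the "<*>" placeholder, an assumed convention) or
    # 'template' (any other token).
    y_true, y_pred = [], []
    for pred, true in zip(pred_templates, true_templates):
        # zip aligns up to the shorter token list; a full implementation
        # would handle length mismatches between the two tokenizations.
        for p, t in zip(pred.split(), true.split()):
            y_pred.append(int(p == "<*>"))
            y_true.append(int(t == "<*>"))
    return f1_score(y_true, y_pred)

def session_f1(pred_labels, true_labels, window=100):
    # Session-level F1 for anomaly detection: logs are grouped into fixed
    # windows of `window` log templates, and a session counts as abnormal
    # if it contains at least one abnormal log.
    sess_true, sess_pred = [], []
    for i in range(0, len(true_labels), window):
        sess_true.append(int(any(true_labels[i:i + window])))
        sess_pred.append(int(any(pred_labels[i:i + window])))
    return f1_score(sess_true, sess_pred)

For example, parsing_f1(["send <*> bytes"], ["send <*> bytes"]) returns 1.0, since every token's {template, variable} label matches the ground truth.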
Experiment: LogPrompt yields helpful and comprehensible content for practitioners during real-world log analysis
• A novel evaluation task of log interpretation: a total of 200 logs were randomly sampled for the human evaluation, accompanied by LogPrompt's actual outputs, with 100 logs related to log parsing and 100 related to anomaly detection (evenly distributed across domains).
• Incorrectly predicted logs (FPs and FNs) were not included in this evaluation. An equal number of normal and abnormal samples were included for anomaly detection, and each selected log for log parsing was required to contain at least one variable. Experts rated the outputs according to the criteria, independently.
• We report two metrics. Average scores for both tasks in terms of usefulness and readability consistently exceeded four, and the average HIP was consistently above 80%, indicating content that experienced log analysts considered overall helpful and readable.

Experiment: more analysis of LogPrompt's interpretability
• Feedback from practitioners:
• "I appreciate the ability of LogPrompt in supporting the interpretation of logs from various domains. As our system continuously incorporates third-party services, sometimes I have to read through the manuals to decipher logs from unfamiliar domains. The explanations generated by LogPrompt can provide a swift grasp of the logs before I find the official definitions."
• "LogPrompt can definitely help in composing reports shortly after system crashes, where I often need to brief non-technical colleagues in meetings."
• "In the realm of software O&M, false alarms are an inescapable reality, with false positives imposing substantial time costs on engineers and false negatives causing severe ramifications. Accompanying automatic analysis outcomes with explanations enables engineers to more promptly ascertain the credibility of the purported anomaly, thereby reducing the time spent on subsequent actions."
• Bad case analysis: a major factor of bad cases is the LLM's lack of domain knowledge, which leads to overly general interpretations of some domain-specific terms; for example, it may describe specific parameters only in generic terms. Another cause is the lack of semantic content in some logs, which can be attributed to their brevity or richness of non-NLP patterns (i.e., digits, codes and addresses).

Experiment: ablation study on the three prompt strategies
• Compared to prompt 5, prompt 2 provides more formal and accurate words (such as "standardized" and "convert") and clearly outlines the intermediate steps for the task (identifying and replacing variables, then converting to a template).
• Interestingly, utilizing only the implicit CoT (requiring the model to generate reasons) can still improve performance, likely because with more natural-language explanations, the distribution of the generated answers is closer to that seen in the models' pre-training phase.
• An overly long context prefixed to the prompt may cause LLMs to pay less attention to the new input logs, thereby deteriorating task performance, which is why performance peaks at a moderate context length.

Future work: domain-adapting smaller-scale LLMs for compatibility with advanced prompt strategies
• Applying LogPrompt to a smaller-scale LLM: Vicuna-13B.
• Deploying large, proprietary, API-dependent LLMs in local environments can be challenging, and the reliability of services may be compromised if the API services of LLMs become unavailable. Therefore, for industrial usage, it is crucial for LogPrompt to be compatible with alternative open-source, privacy-protected, smaller-scale LLMs. (A deployment sketch follows this section.)
• Online log parsing with smaller-scale LLMs: although Vicuna has only 13B parameters, when equipped with LogPrompt it achieves log parsing performance comparable to the 175B GPT model on the HDFS, Linux and Proxifier datasets.
• Additionally, when the prompt strategy transitions from a simple prompt to LogPrompt, Vicuna exhibits significant performance improvements, with an average increase of 380.7% in F1-score.
• As Vicuna is open-source and requires fewer resources for training and deployment, the success of LogPrompt on Vicuna holds promising implications for building in-domain industrial applications.
• The performance of smaller-scale LLMs like Vicuna still has room for improvement. Since the base model [...]
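As a companion to this future-work direction, here is a hedged sketch of serving LogPrompt-style prompts with a locally hosted open-source model through Hugging Face transformers, including a format-retry loop with increasing temperature in the spirit of the experimental setup. The checkpoint id, generation settings, and validator are assumptions for illustration.

# Hedged sketch: local deployment of an open-source LLM for LogPrompt-style
# prompting. The checkpoint id and generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "lmsys/vicuna-13b-v1.5"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def query_local_llm(prompt: str, temperature: float = 0.5) -> str:
    # Generate a sampled continuation for the prompt.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256,
                             do_sample=True, temperature=temperature)
    # Return only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

def query_with_format_retry(prompt, is_valid, start_t=0.5, step=0.4,
                            max_tries=5):
    # Mirrors the setup described earlier: start at temperature 0.5 and,
    # if the response format is invalid, resubmit with the temperature
    # increased (here: by 0.4 per retry, one reading of the setup).
    t = start_t
    answer = query_local_llm(prompt, temperature=t)
    for _ in range(max_tries - 1):
        if is_valid(answer):
            break
        t += step
        answer = query_local_llm(prompt, temperature=t)
    return answer

Swapping the external API for this local path keeps the prompt strategies unchanged, which is the compatibility property the future-work slide argues for.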
