\TitleFontLight curve completion and forecasting using fast and scalable Gaussian processes (MuyGPs)
Abstract
Temporalvariationsofapparentmagnitude,calledlightcurves,areobservationalstatisticsofinterestcapturedbytelescopesoverlongperiodsoftime.LightcurvesaffordtheexplorationofSpaceDomainAwareness(SDA)objectivessuchasobjectidentificationorposeestimationaslatentvariableinferenceproblems.Ground-basedobservationsfromcommercialofftheshelf(COTS)camerasremaininexpensivecomparedtohigherprecisioninstruments,however,limitedsensoravailabilitycombinedwithnoisierobservationscanproducegappytime-seriesdatathatcanbedifficulttomodel.Theseexternalfactorsconfoundtheautomatedexploitationoflightcurves,whichmakeslightcurvepredictionandextrapolationacrucialproblemforapplications.Traditionally,imageortime-seriescompletionproblemshavebeenapproachedwithdiffusion-basedorexemplar-basedmethods.Morerecently,DeepNeuralNetworks(DNNs)havebecomethetoolofchoiceduetotheirempiricalsuccessatlearningcomplexnonlinearembeddings.However,DNNsoftenrequirelargetrainingdatathatarenotnecessarilyavailablewhenlookingatuniquefeaturesofalightcurveofasinglesatellite.Inthispaper,wepresentanovelapproachtopredictingmissingandfuturedatapointsoflightcurvesusingGaussianProcesses(GPs).GPsarenon-linearprobabilisticmodelsthatinferposteriordistributionsoverfunctionsandnaturallyquantifyuncertainty.However,thecubicscalingofGPinferenceandtrainingisamajorbarriertotheiradoptioninapplications.Inparticular,asinglelightcurvecanfeaturehundredsofthousandsofobservations,whichiswellbeyondthepracticalrealizationlimitsofaconventionalGPonasinglemachine.Consequently,weemployMuyGPs,ascalableframeworkforhyperparameterestimationofGPmodelsthatusesnearestneighborssparsificationandlocalcross-validation.MuyGPsallowsustoquicklytrainmodelsonlightcurvedatainsecondsonasingleworkstation.Inthismanuscriptweexplorelightcurvecompletionandforecasting,andcompareembeddingsofthelightcurvedataintomulti-dimensionalspacestotakeadvantageofdailyandyearlyperiodicity.WeshowthatourmethodoutperformsfeedforwardDNNsbothintermsofaccuracyandquantityoftrainingdatarequired.
LLNL-PROC-839253
1 Introduction
Photometriclightcurvesareaseriesofobservationsthattrackthebrightnessofanobjectoveraperiodoftime.Theycanbeusedtocharacterizethedynamicpropertiesofasystem.Astronomersfrequentlyanalyzelightcurvestounderstandawholerangeofastrophysicaltopicsincluding:detailedphysicswithinastar [gaia2019],thediscoveryofexoplanets [kepler2013],classifyingdistantsupernovae [des2022],andcharacterizingthepopulationofnearearthasteroids [atlas2018].SpaceDomainAwareness(SDA)involvesmonitoring,detectingandunderstandingthepopulationofearth-orbitingbodiesorresidentspaceobjects(RSOs).SDAisofincreasingimportanceduetotheincipientgrowthinthenumberofspaceobjectsanddebrisorbitingtheearth,dueinlargeparttotherecentcommercialdevelopmentofsatellites.ThenumbersosuchRSOsisexpectedtogrowbyordersofmagnitudeinthenextdecade.WhileSDAisconcernedwithmaintainingcustodyoforbitingobjects,italsoprioritizestherapididentificationofchangestoandanomalousbehaviorofthemanyvaryingorbitalsystems.Fortunately,lightcurvesarevaluableforinspectingRSOsinmuchthesamewayasastrophysicalphenomena,andcanenablecriticalinformationforSDA.ThebrightnessofanRSOdependsonstructuralfeatureslikesize,shape,andmaterialcompositionaswellthegeometrybetweentheobject,thesunandobserver.Theproliferationoflowcostcommercial-off-the-shelf(COTS)ground-basedtelescopeshasmadethegenerationoflightcurvesfromRSOseasiertoproduce.Furthermore,manyconstellationsofground-basedtelescopeshavebeentaskedwithtrackingRSOsforSDA.ThesefactorshaveenabledtherelativelycheapproductionoflargevolumesofRSOlightcurves,whichhaspromptedpractitionerstoapplyautomationtoanalyzethematscale.PractitionershaverecentlyappliedmachinelearningtolightcurvesofRSOstosolvevariousSDAtasks.Forexample,comparingthelightcurveofanunknownRSOtoacatalogofknownRSOsisusefulforpredictingfeatureslikeshape [linares2014space, furfaro2019shape],materialcomposition [dianetti2019space],andgeneralcategoriesorclasses [linares2016space, jia2018space, furfaro2018space].Furthermore,forecastingRSOlightcurvesintothefutureisusefulfordetectingdeviationsfromtheexpectedpatterns-of-lifeinnear-real-time,affordingthedetectionofanomalouseventssuchasmaneuvers [shabarekh2016novel, dupree2021time]orconfigurationchanges.Thegoalofmachinelearninginthiscontextistolearnafunctionmappinganinputspace(typicallythetimedomain)toaanobservationspace,e.g.magnitude,baseduponindependentandidenticallydistributed(i.i.d.)samplesfromadistributionofinput-observationpairs.Wewillrefertothisdatadistributionasthe“targetdistribution”.Amachinelearningmodelissuccessfulwhenitisabletoaccuratelypredicttheobservationofanunseeninputdrawnfromthetargetdistribution.Bothdiffusion-basedorexemplar-basedmethodshavebeentraditionallyusedinimageortime-seriescompletionproblems.Diffusion-basedmethods [sohl-dickstein2015deep]usethermaldiffusionequationstopropagateinformationfromsurroundingregionsintomissingregions,theyaremostlyeffectivewhenthegapsaresmallandtendtosmoothdetailsoutforlargerproblems.Exemplar-basedmethods [criminisi2004region]usegreedyalgorithmstoapplypatchesoftrainingdatatomissingregionswhichcanproduceimplausiblepatterns.DeepNeuralNetworks(DNNs)areanespeciallypopulartooltomodellightcurvesintheliteratureduetotheirexpressivenessandgeneralizationcapabilities.ADNNisatypeofrepresentationlearningmodel—amachinelearningmodelthatlearnsanappropriatefeaturerepresentationofthedatainadditiontoproducingpredictions.DNNsiterativelytransformtheinputspaceintolatentfeaturespacesusinglinearfunctionsthatare“activated”byelement-wisenonlinearfunctions.DNNsalsohappentobeuniversalfunctionapproximators—itispossibletoapproximateanycontinuousfunctiontoanarbitrarylevelofprecisionusingasufficientlylargeDNN.ThepersistentpopularityofDNNsderivesfromseveralsources,suchasthewidespreadavailabilityofhardwareacceleratorslikegraphicalprocessingunits(GPUs),advancedstochasticoptimizationtechniquesthataidintheirtraining,andthedevelopmentofuser-friendlysoftwarelibrariessuchasTensorflow [Abadi_TensorFlow_Large-scale_machine_2015]andPyTorch [NEURIPS2019_9015].AlthoughDNNshavemanypositivefeatures,theyalsohavesomedrawbacksthatareparticularlynotableinthelightcurveproblem.First,modernDNNarchitecturestypicallyconsistofaverylargenumberofparametersthatmustbetrained.Trainingsuchmodelsgenerallyconsumesalargeamountofcomputingresources,asthemodelanditsgradientareevaluatedovermanyiterationsinordertorefineparametervalues.Inadditiontocomputationalexpense,trainingalargemodeltypicallyrequiresalargeamountofdata.Theliteraturehasobservedaroughlylinearrelationshipbetweenmodelsizeandtheamountoflabelleddatarequiredtotrainit [tan2018survey].Thismeansthatlearninganaccuratelightcurvemodelcanrequirealargenumberofobservationsingeneral.Second,whileDNNmodelslendthemselvestohighdimensionalfeaturespacestheytendtostrugglewithverysmallnumbersofdimensions.Featureengineeringcanoftensolvethisproblem,especiallyintimeseriesscenarioswheresomenotionofamovingwindowusuallyservesasafeaturespace.However,thisstrategyismostsuccessfulwhentheobservationcadenceofthelightcurveishigh.Upcomingground-basedsurveysofdeepspace,suchastheLegacySurveyofSpaceandTime(LSST)areexpectedtoincidentallycapturemanyRSOimagesfromwhichlightcurvescanbeextracted.However,theselightcurveswillbeirregular,sparse,andhavelow-dimensionalfeaturespaces,increasingthechallengeofapplyingexistingtechniques.Third,effectivetrainingtypicallyrequiresalargevolumeoftrainingdatathatiscompleteandrepresentativeofthetargetdistributioni.e.alargemodelrequiresalargeamountoftrainingdatathatisi.i.d.accordingtothetargetdistribution.Inadditiontoinformingpredictionaccuracy,dataindependenceandsizehelpscomplexDNNmodelsavoidoverfittingandsimplymemorizingthetrainingdata.However,collectionsofphysicalmeasurementsareoftenlimitedbytherealitiesofsensoravailabilityandphysicalobstruction.Forexample,weatheraffectsopticalastronomicalmeasurementsbyintroducingcorrelatednoiseorblockingthedesiredobjectfromviewentirely.Furthermore,itisingeneraldesirabletodesignamodelthatcanbealternativelyappliedtoseveraldifferentRSOinferenceproblems,i.e.severaldistincttargetdistributions.Atransferlearningapproach-onewherethemodelisatleastpartiallytrainedondatafromapossiblydifferentdistributionthanthetargetdistribution [tan2018survey]-couldaddressthisneedforalargevolumeoftrainingdata.Indeed,transferlearninghasbeenappliedtosatelliteapertureradarimages [rostami2019deep],radiofrequencyinterferencecharacterization [lefcourt2022space],andclassifyingRSOsusinglightcurves [furfaro2018space].However,transferlearningultimatelyreliesonfeatureslearnedfromapotentiallyunrelateddataset,whichmayleadtoinefficientorinaccurateconclusions.GenerativemodelssuchasVariationalAutoencoders(VAEs)andGenerativeAdversarialNetworks(GANs)areadditionalalternativesfromthemachinelearningliteraturethataddressdataefficiency.VAEsareautoencodersthatimposestructureupontheirlearnedencoderanddecoderfunctionstoensurethatthelatentrepresentationofthedataencodesthetargetdistribution.ResearchershavesuccessfullyappliedVAEstolearnshapesfromthelightcurvesofRSOs [furfaro2019shape].GANsconverselysimultaneouslytrainmodelsthatgeneratesamplesmeanttomimicthetargetdistributionwhiledistinguishingbetweenrealandsyntheticsamples.PractitionershaverecentlyusedGANstoclassifystarsbasedupontheirlightcurves [garcia2022improving].However,GANscanbechallengingandexpensivetotrain.GANsinferadistributionfromaverysmalltrainingdatasetandcansufferfrommodecollapse,non-convergenceorgeneralinstability.Thisbringsustothefourthpoint—DNNshavedifficultyquantifyingtheuncertaintyintheirpredictions.ItisgenerallydifficulttodeterminewhenaDNNisextrapolating—predictingtheresponseondatafromadifferentdistributionorinadifferentregionofthetrainingdata.Thisisproblematicinscientificapplicationsbecauseoverconfidentpredictionscanleadtoincorrectinferenceandunnaturalresultsthataredifficulttointerpret.Furthermore,itisimportantfordecisionmakerstodistinguishbetweenlowandhigh-confidencepredictionswhendrawingconclusionsforSDA.Althoughuncertaintyquantificationmethodsareinresearch,mostpracticalmodelsemployedbypractitionersprovideonlypointpredictionswithnoobviousmechanismtomeasuremodelconfidence.RecentattemptstoprovideuncertaintyquantificationtoDNNsattempttolearnpredictionintervals(i.e.errorbars)inadditiontopointpredictions,butthisliteratureisstilldevelopingandthereisasyetnoconsensussolution [kabir2018neural].OthersattemptahybridapproachwhereaGaussianprocessisappendedtothelastlayerofaDNN [bradshaw2017adversarial].InthismanuscriptweproposeGaussianProcess(GP)modelsasalternativesforthelightcurvemodelingproblem.LikeDNNs,GPsarerepresentationlearningmethodsthataredata-drivenuniversalfunctionapproximators.AGPisatypeofkernelmethodthatimplicitlyandnon-linearlymapsinputfeaturesintoapossiblyinfinitelydimensionalinnerproductspace.Kernelmethodsuseaparameterizedkernelfunctiontocheaplycomputeinnerproductsinthisimplicitspacetopredictunknownresponses.GPscanalsobethoughtofasthegeneralizationofamultivariatenormaldistribution,wherealldatawithinadefineddomainisassumedtobejointlynormallydistributedwiththecovariancedefinedbythekernelfunction.Therefore,predictionsfromGPmodelsareprobabilisticallydefinedthroughconditionalprobability.GPsareattractiveforscientificapplicationsduetothisinherentlyBayesianinferencemodel,whichproducesexplicituncertaintyquantification(Gaussiandistributions)ofitspredictions.Furthermore,GPshavebeenshowntooutperformDNNsindata-starvedregimes [muyskens2022star]andareidealmodelsforlow-dimensionalfeaturespaces [muyskens2021muygps].However,conventionalGPshaveverypoorscalinginthenumberoftrainingobservations,whichhaspreviouslylimitedtheirapplicationtothelightcurvemodelingproblem.BothrealizingthepredictionsfromtheGPmodelandevaluatingthelikelihoodfunctionintrainingrequirecubiccomputationinthenumberofdatapoints,andrequirequadraticmemorytostorethekernelmatrix.WhilethispracticaldrawbackhastendedtolimittheapplicationofGPstosmalldataproblems,scalableapproximateGPmethodshaveproliferatedintheliterature [heaton2019case, liu2020gaussian].Thesemethodstypicaltradeoffaccuracyforspeed.However,theapproximateGPmethodMuyGPsoffersaselectionofsettingsthatdemonstratessuperioraccuracyorcomputationalscalinginseveraldatasets[muyskens2021muygps, muyskens2022star, buchanan2022gaussian].Therefore,inthispaperweusetheMuyGPsapproximateGPestimationmethod [muyskens2021muygps].MuyGPsusesnearestneighborssparsificationandabatchedleave-one-outcross-validationobjectivefunctiontotrainaGP-likemodelinnearlylineartimeinthenumberofdataobservations.InvestigatorshavesuccessfullyappliedMuyGPstocosmologyimageprocessingproblems [goumiri2020star, muyskens2022star, buchanan2022gaussian].Inthispaper,weproposeamethodoflightcurveinterpolationandpredictionusingMuyGPsthatgivesbothpredictionsanduncertaintyquantificationofthosepredictions.Weusethismethodintwomodes:
-
Interpolation,thepredictionofunobservedmagnitudesinthepast,and
-
Forecasting,thepredictionofunobservedfuturemagnitudes.
Interpolationisusefulasadatapreprocessingutilityforcatalogs.Interpolationallowsustofillinmagnitudeobservationsforsparse,irregularlightcurvestomakethemsuitablefordownstreamcomparisontaskssuchasshape,pose,size,type,ormaterialestimation.Forecastingisusefultodetectdeviationsfromexpectedfuturepatterns-of-lifethatcorrespondtoanomalies.TheposteriorvarianceprovidedbyGPsisespeciallyusefultodeterminewhatconstitutesasignificantdeviationfromexpectedbehavior.Insection 2,weprovidebackgroundonthelightcurvedataitself,dataprocessing,aswellasthemachinelearningmethodsweutilizeinourcomparativestudy.Theninsection 3wedescribeseveralnumericalstudiesthatdemonstratethecomparativeperformanceofseveralchoicesofourGPmethodtothatofaDNNandhowtheuncertaintyquantificationprovidedbyourmethodcanbeusedtodetectanomalies.Finally,insection 4,wediscussourconclusions,theimportanceofourfindings,limitations,andfutureworkassociatedwithourmethodology.
2 Methodology
2.1 Lightcurvesdefinition
Lightcurvesaretimeseriesofthebrightnessofresidentspaceobjectsoverlongperiods.Theyareobtainedbyobservingthesameobject,nightafternight,forseveraldaysorevenyears.Onewaytovisualizethemisintwodimensionswithoneaxisrepresentingthetimeofday(orsolarphaseangle)andtheotheraxisrepresentingthedayswhilethepseudo-colorindicatesthebrightness,asinFig. 1.
LightcurvesofRSOscontainpotentiallydetailedinformation.Forexample,onecoulddetectdustaccumulatingonhighlyreflectivesolarpanelsanddeducetheageandlifeexpectancyofasatellite.Higherintensityreflectionsalsohavethepotentialtoinformabouttheshapeoftheobservedobjectsincedifferentfacetsofamulti-facetedobjectwouldreflectdifferently.Inaddition,satellitescanmaneuverorrotate,inwhichcase,thereflectedlightwilldeviatefromtheexpectedpattern.ThosedeviationscanbedetectedandincreaseSDAknowledge.Observationscanbelimitedbythetimeofday,weather,sensorsaturation,andeclipses(seeFig. 2).Dataisonlyavailableatnight,andthemeasurementsarelessaccuratenearduskanddawn.Theweathercanprecludeobservationsforhoursandsometimesdaysatatime.Whenthereflectedlightfromthesunisparticularlybright,sensorscansaturateleadingtomissingdata,thoughthisparticularcaseisusuallyfairlyobviouswhenlookingatthebrightnessofsurroundingavailabledata.Lastly,periodically,thelightfromasatellitewillbeeclipsedbytheearth.
TotestourcodesweusealightcurvedatasetprovidedbyDaveMonet[monetgrams]of43satellites.Allsatellitesinthedatasetarefromthepubliccatalog(www.space-track.org).Weselected13distinctRSOs,thosemanuallyflaggedbyDaveMonetasnominal,forouranalysis.Thereareapproximately500,000datapointsperobject.AllobservationsweretakenfromasinglecamerainFlagstaffAZbetweenSeptember2014andSeptember2018.Thedatasetcontainsthebrightness(magnitude;bandisapproximatelyJohnsonV)aswellasthemeasurementerrorinbrightness(uncertainty).
2.2 Processing
Totestourpredictioncapabilities,weselectaportionofeachlightcurveastestdata.Weusetimeperiodsrangingfromafewhours(3hours)tomultipleweeks(2weeks).Andweeitherselectthetestdataatarandomtimewithinthetimeseries(interpolation)orattheend(extrapolation).Toguaranteethattheselectedperioddoesnotfallatatimelackingobservations(e.g.duringthedayorduringacloudynight)werejectthosethathavefewerthan90%ofthedatawe’dexpecttofindinthattimeintervalifthedatawasuniformlydistributed.Inaddition,weapplythatsameconditiontotheprecedingintervaltoensurethatwe’renotaccidentallyinterpolatingforlongerthatwewouldexpect.Sincethelightcurveshavesomedailyandyearlyperiodicity,wecompareseveralembeddingsofthelightcurvedataintomulti-dimensionalspaces.Thereferenceisthe1Dmodelwhichisjusttheoriginaltimeseries.The2Dmodelhasonedimensionrepresentingtimesofdayasrealnumbersintheinterval,andanotherdimensionrepresentingdaysasintegers,startingatzeroonthefirstdayofobservation.The3Dmodelislikethe2Dmodelbutwithanadditionaldimensionforyears,asintegersbetween0and4,andwithamodifieddaysdimensionmodulo365.Notethatalltheinputdimensionsareeventuallynormalizedtotheintervalasiscustomary.Analternate2Dmodelwithyearandtimeofyearasopposedtodayandtimeofdaywasconsideredbutdismissedforsimplicityandfornotbeingasintuitiveandinteresting.Weobservedthatday-to-daycorrelationisstrongerthanyear-to-yearinourdataset.Notethatitismoretraditionaltomodelthistypeofdataperiodicityusingaperiodickernel,buttheMuyGPsestimationframeworkdependsonakernelsparsitythatisultimatelynotvalidatedwiththatkerneltype.Therefore,theseembeddingsareanovelwaytoreplacesuchaperiodickernelwhilemaintainingthecomputationallyefficientframework.
2.3 Gaussianprocesses
Wewillconsiderthroughoutalightcurvetobeaunivariateresponse,whereistheobservationspacealongtimedimensions.Inpreprocessingwede-trendthedata,andsothereforehaszeromean.InmodelingwithaGP,weassumethatitisdrawnfromaGaussianprocessdistribution.ThismeansthatfollowsamultivariateGaussiandistributionatanyfinitesetofpoints.However,inrealitymeasurementnoiseperturbsourobservedvaluesofatlocations.Weassumethateachmeasurementisperturbedbyhomoscedasticnoisethatarei.i.d..Thatis,
(1) | ||||
(2) |
whereisthemultivariateGaussiandistribution,isthe-dimensionalzerovector,isavariancescalingterm,istheidentitymatrix,andisanpositivedefinite,symmetriccovariancematrixbetweentheelementsofthatiscontrollednon-linearlythroughkernelfunctionwithhyperparameters.isthatisperturbedbythemeasurementnoisepriorandscalingparameter.Similarly,anyadditional(possiblyunobserved)datumisalsojointlynormalwithobserveddatabytheGPassumption.Thus,theconditionaldistributionfortheresponseofatgivenresponsesobservedatandisalsomultivariatenormalwithmeanandvariance
(3) | ||||
(4) |
whereisthecross-covariancematrixbetweenandtheelementsof.GPsaretypicallytrainedbymaximizingthelog-likelihoodoftheobservationswithrespectto.Thislog-likelihoodfunctionpossessesthefollowingform:
(5) |
However,evaluatingEquation 5iscomputationandinmemory,whichisintractableforallbutrelativelysmalldata.AMuyGPsmodelrewritesequations 3 and 4as
(6) | ||||
(7) |
wherearethenearestneighborsofin.MuyGPstrainsintermsofsomelossfunctionoverasampledbatchofsizebyminimizinganobjectivefunction.MinimizingallowsustotrainwithoutevaluatingtheexpensivedeterminantintheGPlikelihood.Whenwesettoleave-one-outcross-validationandtomeansquarederror,trainingreducestominimizingthefunction
(8) |
Noteotherlossfunctionscanbeemployedinthisframework,butthismeansquarederrorfunctionhasbeendemonstratedasperformant [muyskens2021muygps].WeuseBayesianoptimizationtooptimizeEquation 8totraininourexperiments.
2.4 Astandardneuralnetworkmodelforbenchmarking
Neuralnetworkshaveachievedstate-of-the-artresultsoncommonbenchmarkproblemsandarenowubiquitousinscientificapplications.AlthoughthefocalpointofthisworkistheapplicationofGaussianprocessestolightcurvemodeling,webenchmarkthepredictiveperformanceofMuyGPsagainstastandarddeepneuralnetworkmodeltoensureitspredictiveaccuracy.Weselectafully-connectedarchitecture,asthestructureofthefeaturespaceisnotsuitedtoconvolutionalmodels.Moreover,recurrentneuralnetworks,includingLSTMs,areanotherpopularclassofmodelsthatareusefulinforecastingsettings,butnotinterpolationproblems.Fully-connectedneuralnetworkscanberepresentedasthecompositionofaseriesofaffinetransformationsandnonlinearactivationfunctions.Thecompositionofanindividualaffinetransformationandnonlinearactivationfunctionconstitutesalayerofthefully-connectednetwork.Eachaffinetransformationinthenetworkisparameterizedbyweights,whereisthedimensionoftheinputtothelayerandistheoutputdimension,andabiasvector.Welabeltheactivationfunctioninthelayer.Letbeaneuralnetwork.Then,
(9) |
Wetrainafully-connectedneuralnetworkusingthedayoftheyearandtimeofdayastheindependentvariables(features),withthetargetsettothemagnitude(normalizedtotakeonvaluesbetween0and1bydividingbythelargestmagnitudeobserved).ThemodelarchitecturefeaturesReLUactivationfunctionsand5layersofsizes,,,,and.Therearetotalparametersinthemodel,roughlythenumberoftrainingsamplesdependingonthetestcase.Wetraintheneuralnetworkinbatchesofsize128usingtheAdamoptimizerfor100epochs.Weusethemean-squarederrorlossfunctionwith20percentofthetrainingdatawithheldforvalidation,thetrainingratesetto,andalearningrateexponentialdecayfactorof0.5appliedevery30trainingepochs.Wesettheratioofthenumberofparametersinthemodeltothenumberofdatapointsontheorderof,acommonlyacceptedratiointhedeeplearningcommunity.Weselectedthesetraininghyperparametersbasedoncommonvaluesusedindeeplearningtoachievethebestcombinationoftraininglossdecayandvalidationerrorminimization.
3 Results
FirstwecomparethepredictingpowerofGaussianprocessesforthedifferentembeddingsoftheinputdatadescribedinsubsection 2.2.Foreachofthe13lightcurvesinourdataset,werandomlyselectatotalof20timeintervals,5foreachtimedurationin{3hours,1day,1week,2weeks},andusethoseintervalsastestdatawhileusingtherestastrainingdata.TheaccuracyofthepredictionismeasuredastheRootMeanSquareError(RMSE)dividedbytheextent(max-min)ofthemagnitudeinthetestdata.Weusethecommonradialbasisfunction(orRBF)todefineourkernelinallofourexperiments.Thismeansthatweusethekernelfunction
(10) |
TheRBFkernelhaslengthscaleparameter,andourmodeladditionallyhasthemeasurementnoisevariancepriorandvariancescalingparameter.Wefixinourexperiments.WetrainbyoptimizingequationEquation 8viaBayesianoptimization.However,noprediction-basedobjectivefunctionissensitiveto,andsoweoptimizeitaccordingtotheanalyticproceduredescribedinSection2of[muyskens2021muygps].Fig. 3showsthedistributionoftheresultingcomparisonofthepredictionstothetruthforeachembedding.Withamedianof0.066,the2Dembeddingyieldsmorethan3timesbetterpredictionsthantheoriginal1Dembedding(median:0.24).Thebetterpredictionispossiblebecausethedistancebetweensimilardatapointsacrossdaysisreduced,thankstotheextradimension,allowingtheGPtomorereadilyusethemduringinterpolation.Howeverweseethataddingathirddimensionencodingtheyearlyperiodicitydoesnotyieldsignificantimprovementsoverthe2Dembeddingsincetheresultsarequasi-identical(median:0.065).Althoughthedistributionsof2Dand3Dembeddingsaresimilar,fromthison,wefocusonthe2Dembeddinganduseitexclusively.
Partofthemotivationforbeingabletopredictdatainlightcurvesistocompletegapsandmissingdatapoints.Afirstbenchmarkistobeabletocompleterandomlyselecteddatapointsthroughoutalightcurve.Werandomlyselecteither5%,10%,or20%ofthetotalnumberofdatapointavailableineachofthe13lightcurvesinthedataset,5timesforeachproportion.Then,formorerealisticinterpolation,weselectconsecutivedatapointsrepresentinggapsofeither3hours,1day,1week,or2weeks,with5differentrandomstartinginstantsforeach13lightcurve.LastlywerepeatthesameprocedurebutbyselectinggapsattheendofeachlightcurvetoevaluatehowGPsperformforextrapolation.Fig. 4showstheperformanceofGPpredictionsforalloftheseinterpolationandextrapolationtasks.Forrandominterpolation,forallpercentages(5%,10%,or20%),theperformanceissimilarandverygoodwithamedianrelativeRMSEofabout0.03.Forgaps,eitherinterpolationorextrapolation,thereismorevariation,bothforagivengapduration,andacrossdurations.
TocomparetheperformanceofGPscomparedtoDNNs,weranthesametaskofpredictingonedayofmissingdatainasingle4year-longlightcurvewithtrainingdataexclusivelyfromthatsinglelightcurveonthesamemachine.AsshowninFig. 5,GPsachievebetteraccuracyinonlyafractionofthetime.
Fig. 6illustratesasingleexampleofthequalityoftheGPpredictioncomparedtorawmeasurementsforadayofmissingdatausingthe3Dembedding.Themeanpredictionfollowsthetrendofmagnitudeswellasexpected.AsistypicalinGPmodels,ourkernelmodelisassumedtobestationaryandhomoscedastic,meaningthatwelearnasinglesetofhyperparametersthatbestmodelsallthetrainingdata(acrossallmagnitudes).Inthisintervalpictured,weseethatthedataathighmagnitudesseemtohavemorevariancethanthedatacollectedatlowermagnitudes.Ourmodelisagnostictoanychangeinvarianceinthedataitself,butwillonaverageprovidedesireduncertaintyquantification.Notethisonetimegapisasingleintervalthatisinitselftime-correlatedinmagnitudeandthereforevarianceregime.Therefore,thecoverageofthe95%intervalfromthisonesamplecanbedifferentthanthedesiredoveralllevel,butwhenweconsidermanyindependenttimeintervalsofthistype,theuncertaintieswillaveragetothedesired95%confidencelevel.InfutureworkmoreflexibleandcomplexGPkernelfunctionscouldbedesignedwithinhomogeneousornon-stationarykernelstocorrectforthispotentialpatterninmagnitudetoimproveuncertaintiesineachindividualsampleregion.Ifthemagnitudedistributioninourpredictionintervaldiffersfromthatofourtrainingdatathenthereisamodeldiscrepancyinourapproachthatcouldexplainourobservedvariances.
ButthetruepowerofGPscomesfromtheirabilitytopredictafullposteriordistributionandnotjustamean.Thepredictedcovariancecanbeusedtodetectareaswherethepredictiondiffersfromthemeasurementbyasignificantamount.OneapplicationofparticularinterestforSDAisthedetectionofanomalies—potentialmaneuversorstatechanges—inlightcurvedata.Fig. 7showstwoexampleofsuchanomalies.Inbothcases,twofactorsmaketheanomaliesnoticeable.Firstly,thepredictiondiffersfromthemeasurementforasmallbutsignificantperiodoftime,andthedifferenceexceedstwotimesthepredictedstandarddeviation.Andsecondly,themeasurementsfromthesurroundingdayslooksimilartothoseofthefocalday—andtotheprediction—exceptforthatparticularperiod.Thecombinationofthosetwofactorsindicatethatitisindeedthemeasurementsthatareanomalousandnotthepredictions.
4 Conclusion
WehaveshownGaussianprocesses(GPs)usingMuyGPtobeacapableandpertinenttoolforanalyzinglightcurvedata.Unlikeothermachinelearningmethodsthatcanhavemillionsofparameters,Gaussianprocessesaresimpleinthattheydonothavealargearchitecturesearchandinthattheyonlyhaveafewparameterstobeestimated.UnliketraditionallyGPestimationmethods,theimplementationoftheMuyGPsmethodologyallowsastronomerstoapplyGPmethodstoverylargedatawithverylittlecomputationalormemoryburden.Inadditiontotheirimprovedaccessibility,wehaveshownthatinourdesignedexperimentsonlightcurves,Gaussianprocessesarebothmoreaccurateandfasterthananexampleneuralnetwork.Further,theuncertaintyquantificationwegetfromGaussianprocessesallowsustoidentifystatisticallydistinctdeviationsfromthepattern-of-life.Asseenabove,whileGaussianprocessesareabletoingestrawuni-dimensionaltimeseries,encodingtheperiodicityofthelightcurvedatathroughadditionaldimensionsyieldsmuchbetterpredictionswhenwepredictlargegapsofobservations.Thisislikelyduetothedistinctperiodictrendsinthelightcurves.However,weobservedthatencodingthedailyperiodicitywassufficientandthatencodingtheyearlyperiodicitydidnotfurtherimproveprediction.Afirstexplanationisthatsurroundingdaysaretypicallymoresimilarthansurroundingyears,sothemajorityof“nearestneighbors”usedintheGPmodelwouldnaturallybeselectedfromsurroundingdaysratherthanfromsurroundingyears.WehavealsoshownhowGaussianprocessescompareadvantageouslytoneuralnetworksbothintermsofcomputingtimeduringtrainingandpredictionaccuracy.Notethatourmetricforcomputingtimecombinesbothtrainingandpredictiontimefromthetwomodels.ThismetricmasksthetradeoffthatatrainedneuralnetworkcanmakeadditionalpredictionsefficientlywhereastrainingaGaussianprocessusingMuyGPsisveryfast,butthepredictiontimegivenatrainedmodelismuchslowercomparatively.[fastmuygps]demonstrateawaytoimprovepredictiontimeofasimlarkernelinterpolationwithanadditionalapproximation.However,intheonlinescenarioofthelightcurveproblemwhereonemaywanttoperformthisanalysisinnearrealtime,onewouldbothretrainandpredictsimultaneouslysotheGPmethodissignificantlypreferred.BothmethodscanbeparallelizedtotakeadvantageofHighPerformanceComputing(HPC).ForDNNs,parallelizationstechniquesarereadilyavailable,forinstanceinPyTorch,withtheadventofGPUsandTPUs.ForMuyGPs,parallelizationseffortsareinprogressandwillbereleasedinadifferentpublication.WeranalloftheexperimentsinthismanuscriptonasinglecoreofacommoditylaptopusingthesoftwarelibraryMuyGPyS [muygpys2021github].FutureapplicationswilluseMPI [dalcin2011parallel]andJAX [jax2018github]toscalemodeltrainingandevaluationtomultiplecomputenodesusingmanyCPUsandGPUsinparallel.ThesescalabilityfeatureswillenableapplicationsthatscaletotheobservationsizesofLSST,whichareanticipatedtoinvolvehundredsofmillionsofspaceobjects.PossiblefuturedirectionsofresearchinvolvecomparingGPtrainingfromonesetoflightcurvesandusingthesetrainedparameterstomakepredictionsforadifferentsetofunseenlightcurves.Thiswilldeterminewhetheritisnecessarytofitparameterstoeachlightcurve.Finally,sinceourmethodgivesfullyinterpolatedlightcurves,futureworkcouldusethesede-noisedfulltimeseriesforfurtherSDAstudies.Insummary,webelieveourGPmethodisavaluablemethodforlightcurveinterpolationandfuturepredictionforthreescientifictasksofparticularinterest.First,byde-noisingandinterpolatingmissingdata,GPscanbepartofapre-processingpipelinebeforefeedingthedatatofurtheralgorithmsorsoftwarethatcannotworkwithrawdata.Asshowninfigures 1and 2,rawlightcurvescanhavealotofgapsfrequentlyspanningmultipledayssobeingabletointerpolatethosegapsiscrucial.Second,bybeingabletoextrapolatedataoverseveralhours,days,orevenweeks,GPscanbeusedtopredicttheexpectedbehaviorofRSOs,whichcanserveinforecastingandcollisionavoidanceinSDA.Lastandnotleast,byquantifyingtheuncertaintyofpredictions,GPsallowforautomaticchangedetection.Indeed,beingabletopredictafullposteriordistributionenablesdiscerninganomaliesinmeasurementsfromanomaliesinprediction,sinceuncertainpredictionscaneasilyleadtofalsepositives.Infutureresearch,itshouldbepossibletousetrackingdataand/ordocumentedknownmaneuverstofine-tuneandbenchmarkourmaneuverdetectioncapabilities.
Acknowledgments
ThisworkwasperformedundertheauspicesoftheU.S.DepartmentofEnergybyLawrenceLivermoreNationalLaboratoryunderContractDE-AC52-07NA27344withIMreleasenumberLLNL-PROC-839253.FundingforthisworkwasprovidedbyLLNLLaboratoryDirectedResearchandDevelopmentgrant22-ERD-028.ThisdocumentwaspreparedasanaccountofworksponsoredbyanagencyoftheUnitedStatesgovernment.NeithertheUnitedStatesgovernmentnorLawrenceLivermoreNationalSecurity,LLC,noranyoftheiremployeesmakesanywarranty,expressedorimplied,orassumesanylegalliabilityorresponsibilityfortheaccuracy,completeness,orusefulnessofanyinformation,apparatus,product,orprocessdisclosed,orrepresentsthatitsusewouldnotinfringeprivatelyownedrights.Referencehereintoanyspecificcommercialproduct,process,orservicebytradename,trademark,manufacturer,orotherwisedoesnotnecessarilyconstituteorimplyitsendorsement,recommendation,orfavoringbytheUnitedStatesgovernmentorLawrenceLivermoreNationalSecurity,LLC.TheviewsandopinionsofauthorsexpressedhereindonotnecessarilystateorreflectthoseoftheUnitedStatesgovernmentorLawrenceLivermoreNationalSecurity,LLC,andshallnotbeusedforadvertisingorproductendorsementpurposes.