Steps to reproduce:
Train a model using e.g. ImportanceCmd:
$./bin/variant-spark --local -- importance -if data/chr22_1000.vcf -ff data/chr22-labels.csv -fc 22_16051249 -rn 10 -rbs 10 -om target/ch22-model.java -sr 13 -v
Then load that model using e.g. AnalyzeRFCmd:
$./bin/variant-spark --local -- analyze-rf -im target/ch22-model.json
Gives the following exception:
java.io.StreamCorruptedException: invalid stream header: 7B0A2020
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:900)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.<init>(JavaSerializer.scala:63)
    at org.apache.spark.serializer.JavaDeserializationStream.<init>(JavaSerializer.scala:63)
    at org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:122)
    at au.csiro.variantspark.cli.AnalyzeRFCmd$$anonfun$1.apply(AnalyzeRFCmd.scala:81)
    at au.csiro.variantspark.cli.AnalyzeRFCmd$$anonfun$1.apply(AnalyzeRFCmd.scala:80)
    at au.csiro.pbdava.ssparkle.common.utils.LoanUtils$.withCloseable(LoanUtils.scala:18)
    at au.csiro.variantspark.cli.AnalyzeRFCmd.run(AnalyzeRFCmd.scala:80)
    at au.csiro.sparkle.common.args4j.ArgsApp.run(ArgsApp.java:46)
    at au.csiro.sparkle.cmd.CmdApp.runApp(CmdApp.java:9)
    at au.csiro.sparkle.cmd.CmdApp.runApp(CmdApp.java:18)
    at au.csiro.sparkle.cmd.MultiCmdApp.runCommandOrClass(MultiCmdApp.java:58)
    at au.csiro.sparkle.cmd.MultiCmdApp.run(MultiCmdApp.java:54)
    at au.csiro.sparkle.cmd.CmdApp.runApp(CmdApp.java:9)
    at au.csiro.sparkle.cmd.CmdApp.runApp(CmdApp.java:18)
    at au.csiro.pbdava.ssparkle.common.arg4j.AppRunner$.mains(AppRunner.scala:17)
    at au.csiro.variantspark.cli.VariantSparkApp$.main(VariantSparkApp.scala:26)
    at au.csiro.variantspark.cli.VariantSparkApp.main(VariantSparkApp.scala)
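This is because we can output trained models as JSON, but currently don't handle JSON as an input format: as the stack trace shows, AnalyzeRFCmd reads the model through Spark's Java deserializer. The reported header 7B0A2020 is simply the start of the JSON file (an opening brace, a newline and two spaces) rather than the Java serialization magic number ACED0005.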
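The failure can be reproduced without variant-spark or Spark at all. The following is a minimal sketch, not code from the project: it hands a few bytes of JSON to java.io.ObjectInputStream and returns the resulting error message. The class name StreamHeaderDemo and the JSON content (a "trees" key) are made up for illustration; the real model file will contain something else, but any file starting with an opening brace, a newline and two spaces should fail the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.StreamCorruptedException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of variant-spark.
final class StreamHeaderDemo {

    /**
     * Tries to open the given bytes as a Java serialization stream.
     * Returns the exception message if the stream header is rejected,
     * or null if the header is accepted.
     */
    static String headerError(byte[] bytes) {
        // The ObjectInputStream constructor reads and validates the 4-byte header.
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return null;
        } catch (StreamCorruptedException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up JSON; only the first four bytes ('{', '\n', ' ', ' ') matter here.
        byte[] json = "{\n  \"trees\": []\n}".getBytes(StandardCharsets.US_ASCII);
        System.out.println(headerError(json)); // prints "invalid stream header: 7B0A2020"
    }
}
```

The message matches the one in the report, which supports the reading that the file on disk is JSON and the reader is expecting Java serialization.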
I suggest creating a ModelInputArgs to mirror ModelOutputArgs, and adding support for reading regular JSON files as an instance of RandomForestModel.
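One way such an input path could choose between the two formats is to peek at the first bytes of the file before picking a reader. The sketch below is only an illustration of that idea under stated assumptions: ModelFormatSniffer, ModelFormat and sniff are hypothetical names, not existing variant-spark classes, and the JSON check assumes the file starts directly with a brace or bracket (no leading whitespace or byte-order mark). An explicit format option on the command line would be an equally valid design.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of variant-spark.
final class ModelFormatSniffer {

    enum ModelFormat { JAVA, JSON, UNKNOWN }

    /** Classifies a model file from its first bytes. */
    static ModelFormat sniff(byte[] head) {
        // Java serialization streams start with the magic number 0xACED.
        if (head.length >= 2 && (head[0] & 0xFF) == 0xAC && (head[1] & 0xFF) == 0xED) {
            return ModelFormat.JAVA;
        }
        // Assumption: a JSON model starts with '{' or '[' and has no leading whitespace.
        if (head.length >= 1 && (head[0] == '{' || head[0] == '[')) {
            return ModelFormat.JSON;
        }
        return ModelFormat.UNKNOWN;
    }

    /** Peeks at the stream without consuming it, so the chosen reader sees every byte. */
    static ModelFormat sniff(BufferedInputStream in) throws IOException {
        in.mark(2);
        byte[] head = in.readNBytes(2); // requires Java 11 or later
        in.reset();
        return sniff(head);
    }

    public static void main(String[] args) throws IOException {
        byte[] json = "{\n  \"trees\": []\n}".getBytes(StandardCharsets.US_ASCII); // made-up content
        try (BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(json))) {
            System.out.println(sniff(in));        // prints "JSON"
            System.out.println((char) in.read()); // prints "{", the stream was not consumed
        }
    }
}
```

With a check like this in place, the command could route JAVA input to the existing deserializer and JSON input to a JSON reader that builds the RandomForestModel, and fail with a clear message for UNKNOWN instead of a StreamCorruptedException.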