Video2Scenario

Video2Scenario (V2S) is a modular scenario replay tool that translates a screen-recorded mobile application usage scenario into a replayable scenario on a target device. V2S consists of three phases, each of which plays a crucial role in accomplishing this functionality. Because V2S was designed and built with extension in mind, each phase and each of its component elements can be extended or substituted to further the functionality of the pipeline as a whole. This design choice allows researchers and developers to customize V2S for future projects or development use cases.

Phase 1: Frame Extraction and Touch Detection

The purpose of this step is to extract the individual frames from the input video and detect the location and opacity of the touches exhibited in these frames. Phase 1 executes three components: (i) the FrameExtractor, (ii) the TouchDetectorFRCNN, and (iii) the OpacityDetectorALEXNET.

The FrameExtractor first standardizes the video to 30 frames per second and then extracts the individual frames using the ffmpeg-python library [1]. Then, the TouchDetectorFRCNN applies a modified Faster R-CNN model (trained using the TensorFlow Object Detection API) to each frame and predicts the bounding-box location of any touch indicators it finds [2]. Finally, after the touches have been localized by the TouchDetectorFRCNN, the OpacityDetectorALEXNET crops each touch around its bounding box and feeds these crops into the Opacity CNN, an extended version of the AlexNet architecture, which classifies each touch as having high or low opacity. Phase 1 then pairs the touch locations detected by the TouchDetectorFRCNN with the opacity values predicted by the OpacityDetectorALEXNET to form a complete list of detections.
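To make the frame-extraction step concrete, below is a minimal sketch using the ffmpeg-python bindings [1]. The function name, output pattern, and paths are illustrative assumptions; only the 30 fps standardization and the use of ffmpeg-python come from the description above.

```python
import ffmpeg  # ffmpeg-python bindings [1]

def extract_frames(video_path, out_pattern="frames/%05d.png", fps=30):
    """Standardize the video to `fps` frames per second and write one
    image per frame -- a sketch of what the FrameExtractor does."""
    (
        ffmpeg
        .input(video_path)
        .filter("fps", fps=fps)  # normalize to 30 fps
        .output(out_pattern)     # one image file per frame
        .run(quiet=True)
    )

extract_frames("scenario.mp4")
```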

Phase 2: Action Classification

Phase 2 of V2S groups the touches detected in Phase 1 and translates them into a set of actions. It reads the detected touches and executes the GUIActionClassifier.

The GUIActionClassifier performs two distinct steps to accurately classify the depicted actions: (i) an action grouping step that organizes individual touches across consecutive frames into discrete actions, and (ii) an action translation step that associates these groupings with a specific action type. The output of this component is a list of detected actions that can be written to a JSON file.
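For intuition, here is a minimal, hypothetical sketch of such a grouping-then-translation pass: touches on consecutive frames are clustered into one action, and each cluster is labeled with a simple heuristic. The data layout, thresholds, and label names are illustrative assumptions, not the actual GUIActionClassifier logic.

```python
from collections import namedtuple

# One touch detection: frame index and bounding-box center (illustrative).
Touch = namedtuple("Touch", ["frame", "x", "y"])

def group_touches(touches):
    """Step (i): cluster touches on consecutive frames into actions."""
    groups, current = [], []
    for t in sorted(touches, key=lambda t: t.frame):
        if current and t.frame - current[-1].frame > 1:
            groups.append(current)  # gap between frames => new action
            current = []
        current.append(t)
    if current:
        groups.append(current)
    return groups

def translate(group, fps=30, move_px=30, long_s=0.5):
    """Step (ii): assign an action type (thresholds are assumptions)."""
    dx = group[-1].x - group[0].x
    dy = group[-1].y - group[0].y
    if (dx * dx + dy * dy) ** 0.5 > move_px:
        return "SWIPE"
    return "LONG_CLICK" if len(group) / fps > long_s else "CLICK"
```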

Phase 3: Scenario Generation

Phase 3 of V2S translates the detected actions into a script that can be replayed on the target device. Phase 3 reads the actions produced by Phase 2 and executes the Action2EventConverter.

The Action2EventConverter converts the high-level actions produced by Phase 2 into low-level commands in the sendevent format. The script generated by the Action2EventConverter is then converted by the Translator into a format that is executable on the target device. Once this translation has occurred, V2S pushes the executable file and a modified version of the RERAN script to the device, replays the scenario, and records the screen while it runs.
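As a concrete illustration of the sendevent format, the sketch below renders a single tap as sendevent shell lines. The Linux multi-touch event codes are standard, but the device path, coordinates, and helper function are illustrative assumptions, not the actual Action2EventConverter output.

```python
# Standard Linux input event types and multi-touch codes.
EV_SYN, EV_ABS = 0, 3
ABS_MT_POSITION_X, ABS_MT_POSITION_Y = 53, 54
ABS_MT_TRACKING_ID = 57
SYN_REPORT = 0

def tap_events(x, y, device="/dev/input/event2"):
    """Render one tap as `sendevent` lines (the device path is a guess;
    the real one comes from `adb shell getevent`)."""
    events = [
        (EV_ABS, ABS_MT_TRACKING_ID, 0),           # finger down
        (EV_ABS, ABS_MT_POSITION_X, x),
        (EV_ABS, ABS_MT_POSITION_Y, y),
        (EV_SYN, SYN_REPORT, 0),                   # commit the touch
        (EV_ABS, ABS_MT_TRACKING_ID, 4294967295),  # finger up (-1 unsigned)
        (EV_SYN, SYN_REPORT, 0),
    ]
    return ["sendevent {} {} {} {}".format(device, t, c, v)
            for t, c, v in events]

for line in tap_events(540, 960):
    print(line)
```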

Setup Instructions

Prerequisites

  • Python 3.6.9 installed
    • Newer versions will not work with the required version of TensorFlow.
    • If Python is not installed yet, you can use Anaconda/Miniconda as described below.
  • git (with git lfs) installed
  • adb installed
    • See the adb installation instructions.
    • After installing, be sure to add the executable's path to your v2s configuration file.
  • Enable USB debugging on your physical device/emulator

Installing Anaconda/Miniconda

  • We will use conda to manage Python package dependencies.
  • We recommend you install Miniconda or Anaconda.
    • Select the "Add Anaconda to PATH" option during the install (more than one path variable is needed, and this option takes care of them all).
  • A fresh install ships with Python 3.7 or newer; as mentioned earlier, downgrade to Python 3.6.9 using conda install python=3.6.9.

V2S Installation

  • Ensure that the environment you are running in uses Python 3.6.9.
  • Current option:
    • Clone the repository, navigate to the python_v2s directory, and execute pip install . (note the trailing dot).
    • Run pip show v2s to confirm that v2s has been installed and to locate the v2s package on your system. To find the auxiliary files necessary for running v2s, navigate to the path specified by sys.prefix and find v2s (see the sketch after this list).
  • To be implemented at a later date:
    • pip install v2s
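A minimal sketch of locating those auxiliary files from Python, assuming they live in a v2s directory directly under sys.prefix as described above:

```python
import os
import sys

# Per the notes above, the v2s auxiliary files sit under sys.prefix.
aux_dir = os.path.join(sys.prefix, "v2s")
print("v2s auxiliary files:", aux_dir)
print("exists:", os.path.isdir(aux_dir))
```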

Execution

  • Update v2s_config.json to list all of the video scenarios to be analyzed, or create a new configuration file following the same structure and specify its path with the --config option. Ensure that the detection models are appropriate for your device and that the application name appears in the app_config.json file.
  • If necessary, update device_config.json to include your device specs, and update app_config.json to include the application apk and package information. The commands to determine the specs are as follows (see the sketch after this list):
    • device - adb shell getevent -t
    • max_x, max_y - adb shell getevent -lp
    • width, height - adb shell wm size
    • X, Y, EV_ABS, X, Y, PRESS, TRACK_ID, MAJOR, EV_SYN, EV_KEY - adb shell getevent -t and adb shell getevent -lt
  • When you are ready to run the analysis, run exec_v2s --config=<filename>, where <filename> is the path to the JSON configuration file listing all of the video scenarios to be analyzed. If no config argument is specified, the default v2s_config.json file is used, which is located at sys.prefix with the v2s package. To create your own config files, follow the structure outlined in v2s_config.json and include the same fields.
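If it helps, the device specs above can also be pulled programmatically. This small wrapper around the adb commands listed in this section is our own illustration, not part of v2s (written to stay compatible with Python 3.6):

```python
import subprocess

def adb(*args):
    """Run an adb command and return its stdout as text."""
    result = subprocess.run(["adb"] + list(args),
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout.strip()

# Commands mirror the device_config.json notes above.
print(adb("shell", "wm", "size"))       # width, height
print(adb("shell", "getevent", "-lp"))  # input devices with max_x, max_y
```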

Troubleshooting Comments

  • If you are trying to screen record with adb on your device and get the error Unable to get output buffers (err=-38) or Encoder failed (err=-38), indicating that the resolution of your device is not supported by the encoder, try recording at a smaller resolution using the --size <width>x<height> option (for example, adb shell screenrecord --size 1280x720 /sdcard/demo.mp4). More information can be found in the adb docs.
  • If you cloned the repository before installing git lfs, install it and then run the following commands in the repository to resolve the large files:
    • git lfs fetch
    • git lfs pull
  • If protoc object_detection/protos/*.proto --python_out=. fails (on Windows), create a batch file in the same directory where you attempted to run the command with the contents below, run it, and then delete it:
    • for %%f in (object_detection\protos\*.proto) do protoc "%%f" --python_out=.

References

  1. https://github.com/kkroening/ffmpeg-python
  2. https://github.com/tensorflow/models/tree/master/research/object_detection

Open Source Projects Used

  1. ffmpeg-python
  2. Tensorflow Object Detection API
  3. python-Levenshtein
  4. wand
