Stripping Embedded MP4s out of Android 12 Motion Photos

Motion Photos is a feature of the Google Camera application on some Android devices (Pixel only, as far as I know) that records a short video at the same time a picture is taken in order to offer you a better shot of the same moment. The Google Camera application analyzes the frames in the video to find one where everyone’s eyes are open, everyone is smiling, etc. Implementation-wise, it works by embedding an MP4 video inside the image container.

This feature is nice for quickly sharing photos with family and friends, with the caveat that stills taken from the video are lower quality than the actual image (so you wouldn’t want to depend on them for any serious use). Furthermore, Motion Photos are not supported anywhere outside the Google ecosystem, so in effect they are just standard JPEGs with extra fat taking up space for a vendor-specific feature you’re probably never going to use.

Stripping the Embedded Videos

For long-term archival purposes I’d rather strip out the embedded video and be left with normal JPEGs that can be read on any system — now or in the future — with open source software. The method for doing this is trivial and has been known for some time: seek to the position in the file where the MP4 header starts and truncate the file from that point forward. I have been using this method in an image archiving pre-process script for a few months, but it stopped working when I updated my phone to Android 12 Beta.

As of Android 12 it appears that the MP4 header in Motion Photos has changed from ftypmp4 to ftypisom. For completeness, my pre-processing script now checks for both of these as well as another (older) variation. The relevant excerpt from the script is:

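for file in PXL_*.MP.jpg; do
    # Don't crash when there are no files matching the glob
    [ -f "$file" ] || continue

    # Check MP4 header, newer versions first
    unset ofs
    for header in 'ftypisom' 'ftypmp4' 'ftypmp42'; do
        # grep prints "offset:match" for each hit
        ofs=$(grep -F --byte-offset --only-matching --text "$header" "$file")

        if [[ $ofs ]]; then
            # Keep only the first byte offset
            ofs=${ofs%%:*}
            # The MP4 starts with a 4-byte box size just before the "ftyp" tag
            truncate -s $((ofs-4)) "$file"

            # Go to next image
            continue 2
        fi
    done
done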

This works well and handles a few corner cases, though I seem to have stumbled upon another mystery.
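
Since truncate rewrites files in place, a dry run can be reassuring before pointing the loop at an archive. Below is a minimal sketch of two helpers, mp4_cut_point and ends_with_jpeg_eoi. Both names are made up for illustration and are not part of the script above. The second helper assumes the still image ends immediately before the video, so that the two bytes before the cut are the JPEG end-of-image marker FF D9; that assumption may not hold for every Motion Photo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the byte offset where the embedded MP4 would be
# cut off, without modifying the file. Returns non-zero if no header is found.
mp4_cut_point() {
    local file=$1 header match
    # With -F, 'ftypmp4' also matches the older 'ftypmp42' variant
    for header in 'ftypisom' 'ftypmp4'; do
        # grep prints "offset:match" per hit; LC_ALL=C keeps it byte-oriented
        match=$(LC_ALL=C grep -F --byte-offset --only-matching --text \
            "$header" "$file") || continue
        # Keep the first offset; the box starts 4 bytes before the 'ftyp' tag
        echo $(( ${match%%:*} - 4 ))
        return 0
    done
    return 1
}

# Assumption: the still image ends right before the video, so the two bytes
# before the cut point should be the JPEG end-of-image marker FF D9.
ends_with_jpeg_eoi() {
    local file=$1 cut=$2 tail_bytes
    tail_bytes=$(head -c "$cut" "$file" | tail -c 2 | od -An -tx1 | tr -d ' \n')
    [ "$tail_bytes" = 'ffd9' ]
}
```

For example, cut=$(mp4_cut_point "$file") && ends_with_jpeg_eoi "$file" "$cut" only succeeds when a header is found and the cut lands right after a JPEG end marker, which makes it a reasonable guard to run before truncate.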

Motion Photos without Embedded Videos

An unsolved mystery is that many images appear to be Motion Photos on the device, but don’t contain an embedded video. Attempting to read the ftypisom header in one such file causes grep to exit with a non-zero return code:

$ grep -F --byte-offset --only-matching --text ftypisom \
    PXL_20210910_075209522.MP.jpg
$ echo $?
1

According to exiftool the file does have Motion Photo metadata:

$ exiftool -xmp:all PXL_20210910_075209522.MP.jpg
XMP Toolkit                     : Adobe XMP Core 5.1.0-jc003
Motion Photo                    : 1
Motion Photo Version            : 1
Motion Photo Presentation Timestamp Us: 1245580
Has Extended XMP                : 88CC75F8F29E9AEB6DF8089BE44A6C4B
Directory Item Mime             : image/jpeg
Directory Item Semantic         : Primary
Directory Item Length           : 0
Directory Item Padding          : 0
HDRP Maker Note                 : (Binary data 23159 bytes, use -b option to extract)

I poked around the file with a hex editor for a few minutes, but wasn’t sure what to look for. Someone who knows what they are looking for would find exiftool -htmldump enlightening. Maybe someone will figure this mystery out (or a subsequent Android release will fix it). For now I’m stuck with this half-working solution.
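
To get a sense of how many files are affected, the check can be turned around to list files that are named like Motion Photos but contain none of the known headers. This is a rough sketch: has_embedded_mp4 and list_videoless_motion_photos are hypothetical names, and it assumes the PXL_*.MP.jpg naming used in the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file contains any known MP4 'ftyp' tag.
has_embedded_mp4() {
    # --quiet: no output, exit status only; -e can be repeated with -F
    LC_ALL=C grep -F --quiet --text -e 'ftypisom' -e 'ftypmp4' "$1"
}

# Print the names of Motion Photo candidates (judged by file name only)
# in the given directory that contain none of the known headers.
list_videoless_motion_photos() {
    local dir=${1:-.} file
    for file in "$dir"/PXL_*.MP.jpg; do
        # Skip the literal pattern when nothing matches the glob
        [ -f "$file" ] || continue
        has_embedded_mp4 "$file" || printf '%s\n' "$file"
    done
}
```

Running list_videoless_motion_photos on a directory of photos prints one path per file without an embedded video, which could then be cross-checked against the exiftool output.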