{"id":34996444,"url":"https://github.com/apaz-cli/speed-challenge","last_synced_at":"2026-05-19T21:05:45.965Z","repository":{"id":110104074,"uuid":"373949397","full_name":"apaz-cli/Speed-Challenge","owner":"apaz-cli","description":"A computer vision model that can predict the speed of a car from a video taken from the inside via machine learning. The comma.ai speed challenge.","archived":false,"fork":false,"pushed_at":"2021-06-04T20:53:37.000Z","size":804,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-12-28T15:26:07.578Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apaz-cli.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-06-04T20:10:52.000Z","updated_at":"2024-05-08T22:33:26.000Z","dependencies_parsed_at":"2023-04-13T10:11:50.879Z","dependency_job_id":null,"html_url":"https://github.com/apaz-cli/Speed-Challenge","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/apaz-cli/Speed-Challenge","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apaz-cli%2FSpeed-Challenge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apaz-cli%2FSpeed-Challenge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apaz-cli%2FSpeed-Challenge/releases
","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apaz-cli%2FSpeed-Challenge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apaz-cli","download_url":"https://codeload.github.com/apaz-cli/Speed-Challenge/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apaz-cli%2FSpeed-Challenge/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33233131,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-19T15:49:41.270Z","status":"ssl_error","status_checked_at":"2026-05-19T15:49:22.917Z","response_time":58,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-12-27T02:16:11.713Z","updated_at":"2026-05-19T21:05:45.960Z","avatar_url":"https://github.com/apaz-cli.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv class=\"cell markdown\"\u003e\n\n# The comma.ai Programming Challenge:\n\n\u003cbr\u003e\n\n### The goal of this challenge is to build a machine learning computer vision model that can predict the speed of a car from a video taken from inside.\\*\n\n\u003cbr\u003e\n\n  - data/train.mp4 is a video of driving containing 20400 frames. 
Video\n    is shot at 20 fps.\n  - data/train.txt contains the speed of the car at each frame, one\n    speed on each line.\n  - data/test.mp4 is a different driving video containing 10798 frames.\n    Video is shot at 20 fps.\n\n## Deliverable\n\nYour deliverable is test.txt. E-mail it to \u003cgivemeajob@comma.ai\u003e, or if\nyou think you did particularly well, e-mail it to George.\n\n## Evaluation\n\nWe will evaluate your test.txt using mean squared error. \\\u003c10 is good.\n\\\u003c5 is better. \\\u003c3 is heart.\n\n\u003cbr\u003e\n\n[\\*See the original repo for the original\nwording](https://github.com/commaai/speedchallenge)\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n# My Solution:\n\nSince the amount of training data is so small, only about twenty\nthousand frames, which is less than the few hundred thousand I'd accept\nas a minimum, I think that neural networks are out of the question.\n\nIt would be possible to pretrain a neural network with external data\nsources and then tune the model on the test video, but I don't think it's\nenough data, even to tune a pretrained model. In any case, I don't want\nto go that route. I think it's against the spirit of the competition.\n\nInstead, let's fall back to traditional Computer Vision methods. Let's\nuse a keypoint extraction algorithm (Oriented FAST and Rotated BRIEF) to\ntrack objects and points of interest, then train a regression model\n(TODO Choose regression model) to predict the speed of the car based on\nhow far each matching keypoint moves.\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n# Preprocessing\n\n## Rip frames\n\nExport each frame of the video as an image. This may take a while.\n\n## Crop\n\nIn the provided training video (and presumably the video this will be\ntested on), the hood and tinted upper bit of the windshield obscure the\nview. 
Only an area of roughly (640, 320) with its upper corner at (0,\n34) is useful.\n\n## Slicing\n\nWe have no labels for our test dataset. To ensure that our model is\naccurate and can generalize well to the testing video, we need to\nreserve some more labeled data to validate it on. To double the number\nof examples, we can slice the cropped image in half, yielding two (320,\n320) square images. We'll reserve a small portion of our training\nvideo for validating our model.\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"1\"\u003e\n\n``` python\nimport os\nfrom time import time\nfrom glob import glob\nfrom PIL import Image\nimport matplotlib.pyplot as plt\nimport cv2\nimport pickle\nimport numpy as np\nfrom tqdm import trange  # progress-bar range used by the crop/slice loops\n%matplotlib inline\n```\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"2\"\u003e\n\n``` python\n# For the first run, set both of these to True.\n\n# Rip the video into frames, then crop and slice.\npreprocess = False \n\n# Extract image features from the sliced crops\nextract    = False\n```\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n### Rip Frames\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"3\"\u003e\n\n``` python\nnum_trainframes = 20400\nnum_testframes = 10798\n\n# Create a folder if it doesn't exist\ndef mkdir(dir):\n    try:\n        os.mkdir(dir)\n    except FileExistsError:\n        # The folder is already there, nothing to do\n        pass\n\ndef video_to_frames(input_loc, output_loc):\n    mkdir(output_loc)\n    os.system(f'ffmpeg -i {input_loc} {output_loc}/%d_full.png')\n\nif preprocess:\n    video_to_frames('data/train.mp4', 'data/trainframes')\n    print(\"Ripped frames from train video.\")\n    video_to_frames('data/test.mp4',  'data/testframes')\n    print(\"Ripped frames from test video.\")\n\n\n# Show the first frame of the video file\ndef showFirstFrame(videofile):\n    _, image = cv2.VideoCapture(videofile).read()\n    #cv2.imshow(image)  # save frame as JPEG file\n    plt.imshow(cv2.cvtColor(image, 
cv2.COLOR_BGR2RGB))\n    plt.axis('off')\n    plt.show()\nshowFirstFrame('data/train.mp4')\n```\n\n\u003cdiv class=\"output display_data\"\u003e\n\n![](/tmp/pandoc/426ca96712ab378b4f740ca85c37c0eec6be0219.png)\n\n\u003c/div\u003e\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n### Crop and Slice\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"4\"\u003e\n\n``` python\n# Top left corner: (0, 34)\n# Width/Height: (640, 320)\ncropleft = (0, 34, 320, 320+34)\ncropright = (320, 34, 640, 320+34)\nwidth, height = (112,112)\n\nif preprocess:\n    for i in trange(0, num_trainframes):\n        framepath = f'data/trainframes/{i+1}_full.png'\n\n        img = Image.open(framepath)\n        img.crop(cropleft).save(f'data/trainframes/{i}_left.png')\n        img.crop(cropright).save(f'data/trainframes/{i}_right.png')\n        os.remove(framepath)\n    \n    for i in trange(0, num_testframes):\n        framepath = f'data/testframes/{i+1}_full.png'\n\n        img = Image.open(framepath)\n        img.crop(cropleft).save(f'data/testframes/{i}_left.png')\n        img.crop(cropright).save(f'data/testframes/{i}_right.png')\n        os.remove(framepath)\n\ndef showCroppedSlicedFrame():\n    plt.subplot(1, 2, 1)\n    plt.imshow(Image.open('data/trainframes/1_left.png'));\n    plt.axis('off')\n    plt.subplot(1, 2, 2)\n    plt.imshow(Image.open('data/trainframes/1_right.png'));\n    plt.axis('off')\n    plt.show()\nshowCroppedSlicedFrame()\n```\n\n\u003cdiv class=\"output display_data\"\u003e\n\n![](/tmp/pandoc/48b7b815dbf07d56c152a36860cb78b14cde67f5.png)\n\n\u003c/div\u003e\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n# Extract Image Features\n\nI'll be using a computer vision algorithm known as [ORB (Oriented FAST\nand Rotated BRIEF)](https://doi.org/10.1109/ICCV.2011.6126544) to\nextract keypoints and feature descriptor vectors from the images.\n\nKeypoints and feature vectors are extremely useful for object 
detection,\nrecognition, and tracking tasks. It should hopefully allow us to\npinpoint the position of objects as we're driving down the road, by\ntracking the location of the object through each frame. The distance\nthat each keypoint moves for each frame should give us a good idea of\nhow fast we're going.\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"5\" data-tags=\"[]\"\u003e\n\n``` python\n# Initialize OpenCV's implementation of ORB\nORB = cv2.ORB_create()\n\n# Try to create the output folder if it doesn't exist.\nmkdir('data/trainfeatures/')\nmkdir('data/testfeatures/')\n\ndef extract_features_and_keypoints(in_fname, out_fname):\n    # Open the image\n    img = cv2.imread(in_fname, 1)\n\n    # Compute the keypoints and descriptor vectors\n    kps, des = ORB.detectAndCompute(img, None)\n\n    # Handle cases where no keypoints were found\n    des = [] if not len(kps) else des\n    if not len(des):\n        print(f'{in_fname} has no keypoints.')\n        \n\n    # Convert to a nicer format to store\n    kps = [(point.pt, point.size, point.angle, point.response, point.octave, point.class_id, desc) for point, desc in zip(kps, des)]\n\n    # Save the results\n    with open(out_fname, 'wb') as f:\n        pickle.dump(kps, f)\n\n# Extract the features and keypoints for all our images\nif extract:\n    for i in range(num_trainframes):\n        extract_features_and_keypoints(f'data/trainframes/{i}_left.png', f'data/trainfeatures/{i}_left.kpv')\n        extract_features_and_keypoints(f'data/trainframes/{i}_right.png', f'data/trainfeatures/{i}_right.kpv')\n    for i in range(num_testframes):\n        extract_features_and_keypoints(f'data/testframes/{i}_left.png', f'data/testfeatures/{i}_left.kpv')\n        extract_features_and_keypoints(f'data/testframes/{i}_right.png', f'data/testfeatures/{i}_right.kpv')\n```\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"6\"\u003e\n\n``` python\ndef 
load_features_and_keypoints(fname):\n    with open(fname, 'rb') as f:\n        loaded = pickle.load(f)\n        kps = [cv2.KeyPoint(x=point[0][0],y=point[0][1],_size=point[1], _angle=point[2], _response=point[3], _octave=point[4], _class_id=point[5]) for point in loaded]\n        des = np.array([point[6] for point in loaded])\n        return kps, des\n\n\n# Load the features into memory\ntrain_left_features  = [ load_features_and_keypoints(f'data/trainfeatures/{i}_left.kpv')  for i in range(num_trainframes) ]\ntrain_right_features = [ load_features_and_keypoints(f'data/trainfeatures/{i}_right.kpv') for i in range(num_trainframes) ]\ntest_left_features   = [ load_features_and_keypoints(f'data/testfeatures/{i}_left.kpv')   for i in range(num_testframes)  ]\ntest_right_features  = [ load_features_and_keypoints(f'data/testfeatures/{i}_right.kpv')  for i in range(num_testframes)  ]\n\n# Hold back validation set\nvalid_features = [train_right_features[i] for i in range(len(train_right_features)//2, len(train_right_features))]\n\n\n# Display the first frame split with its keypoints.\n#img = cv2.imread()\n#img = cv2.drawKeypoints(img, kp, None)\n#plt.imshow()\n#plt.axis('off')\n#plt.show()\n```\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"7\"\u003e\n\n``` python\ndef showCroppedSlicedFeatureExtractedFrame(i):\n    # Load image and change color space\n    left_img  = cv2.imread(f'data/trainframes/{i}_left.png', 1)\n    right_img = cv2.imread(f'data/trainframes/{i}_right.png', 1)\n\n    left_img  = cv2.cvtColor(left_img, cv2.COLOR_BGR2RGB)\n    right_img = cv2.cvtColor(right_img, cv2.COLOR_BGR2RGB)\n\n    # Load keypoints and descriptors\n    left_kp, left_des   = train_left_features[i]\n    right_kp, right_des = train_right_features[i]\n\n    # Paste the keypoints onto the image\n    left_img  = cv2.drawKeypoints(left_img, left_kp, None)\n    right_img = cv2.drawKeypoints(right_img, right_kp, None)\n\n    # Display the image\n    
plt.subplot(1, 2, 1)\n    plt.imshow(left_img);\n    plt.axis('off')\n    plt.subplot(1, 2, 2)\n    plt.imshow(right_img);\n    plt.axis('off')\n    plt.show()\n\nshowCroppedSlicedFeatureExtractedFrame(0)\nshowCroppedSlicedFeatureExtractedFrame(1)\nshowCroppedSlicedFeatureExtractedFrame(2)\n```\n\n\u003cdiv class=\"output display_data\"\u003e\n\n![](/tmp/pandoc/33279ea8706bc8ebd75437f23a38b27bdd8e3da6.png)\n\n\u003c/div\u003e\n\n\u003cdiv class=\"output display_data\"\u003e\n\n![](/tmp/pandoc/2238c2746f085d5d662447c89d849e2cd447c7d9.png)\n\n\u003c/div\u003e\n\n\u003cdiv class=\"output display_data\"\u003e\n\n![](/tmp/pandoc/cd2519ec145e31f129f0388b12050b0efac6bc8c.png)\n\n\u003c/div\u003e\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\nAs you can see, not all the same keypoints are detected in every frame.\nThis is fine however. The descriptor vector for each keypoint keeps\ntrack of the relevant contextual information.\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n## Find Frames that are missing keypoints\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"8\"\u003e\n\n``` python\ndef find_missing(features):\n    return [i for i in range(len(features)) if not len(features[i][1])]\n\nmtrainleft  = find_missing(train_left_features)\nmtrainright = find_missing(train_right_features)\nmtestleft   = find_missing(test_left_features)\nmtestright  = find_missing(test_right_features)\n\nprint(f'Missing ({len(mtrainleft)}, {len(mtrainright)}) frames of features from each side of the training data.')\nprint(f'Missing ({len(mtestleft)},  {len(mtestright)}) frames of features from each side of the testing data.')\nprint()\nprint(f'Missing left train features:  ', mtrainleft)\nprint(f'Missing right train features: ', mtrainright)\nprint(f'Missing left test features:   ', mtestleft)\nprint(f'Missing right test features:  ', mtestright)\nprint()\nprint('Missing from both left and right train: ', [i for i in mtrainleft 
if i in mtrainright])\nprint('Missing from both left and right test:  ', [i for i in mtestleft if i in mtestright])\n```\n\n\u003cdiv class=\"output stream stdout\"\u003e\n\n    Missing (52, 43) frames of features from each side of the training data.\n    Missing (7,  3) frames of features from each side of the testing data.\n    \n    Missing left train features:   [14396, 14397, 14408, 14410, 14411, 14428, 14429, 14452, 14453, 14456, 14457, 14458, 14459, 14710, 14714, 14715, 14716, 14735, 14736, 14737, 14738, 14739, 14740, 14741, 14742, 14743, 14744, 14745, 14747, 14748, 14752, 14753, 14754, 14755, 14756, 14757, 14758, 14759, 14760, 14761, 14762, 14763, 14764, 14765, 14766, 14767, 14768, 14769, 14770, 14772, 14773, 14774]\n    Missing right train features:  [0, 11934, 11937, 11938, 12340, 12361, 12363, 12364, 12365, 12366, 12367, 12368, 12369, 12370, 12371, 12372, 12373, 12374, 12375, 12376, 12378, 12379, 12380, 12382, 12383, 12384, 12385, 12386, 12387, 12388, 12389, 12390, 12392, 12394, 17220, 17221, 17222, 17223, 17224, 17225, 17226, 17227, 17229]\n    Missing left test features:    [188, 712, 713, 714, 715, 716, 718]\n    Missing right test features:   [4546, 4547, 4549]\n    \n    Missing from both left and right train:  []\n    Missing from both left and right test:   []\n\n\u003c/div\u003e\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\nYou'll notice here that ORB failed to extract keypoints and descriptor\nvectors from some of the images. However, note that we still have at\nleast one keypoint for each frame. 
If the right side of the frame\ndoesn't have any features, the left does.\n\nWe always have at least one keypoint to work with.\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n## Slice up the usable parts of each video\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n## Generate training examples\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell markdown\"\u003e\n\n### Load labels\n\n\u003c/div\u003e\n\n\u003cdiv class=\"cell code\" data-execution_count=\"9\"\u003e\n\n``` python\n# Load the label file\nlabels = []\nwith open('data/train.txt') as f:\n    labels = f.readlines()\nlabels = [ float(label.strip()) for label in labels ]\n\nprint(f'min: {min(labels)}, max: {max(labels)}, avg: {sum(labels)/len(labels)}')\n```\n\n\u003cdiv class=\"output stream stdout\"\u003e\n\n    min: 0.0, max: 28.130404, avg: 12.18318166044118\n\n\u003c/div\u003e\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapaz-cli%2Fspeed-challenge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapaz-cli%2Fspeed-challenge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapaz-cli%2Fspeed-challenge/lists"}