-
Building CLIP from Scratch to Classify Pokemon
Learning about how Contrastive Language-Image Pre-training (CLIP) works
-
TRTorch: A Compiler for TorchScript Targeting NVIDIA GPUs
Writing a compiler for PyTorch that optimizes neural nets for NVIDIA GPUs
-
Real-Time Autonomous Vehicle Perception on Heterogeneous Hardware
Showing how you can use DRIVE AGX and its suite of accelerators to achieve real-time perception for self-driving cars
-
TensorRT Python API
Making NVIDIA's neural network inference optimizer accessible to the everyday data scientist
-
Peter Pan: Semantic Image Compression
A method for compressing images, prioritizing semantically important data
-
3DFS: Deformable Dense Depth Fusion and Segmentation for Object Reconstruction from a Handheld Camera
We propose an approach for 3D reconstruction and segmentation of a single object placed on a flat surface from an input video. We first perform dense depth map estimation for multiple views using an objective function that preserves detail. The resulting depth maps are then fused using an implicit surface function that is robust to estimation error, producing a smooth surface reconstruction of the entire scene. Finally, the object is segmented from the remaining scene using a 2D-3D segmentation that incorporates image and depth cues with priors and regularization over the 3D volume and 2D segmentations. We evaluate 3D reconstructions qualitatively on our Object-Videos dataset, comparing against fusion, multiview stereo, and segmentation baselines. We also quantitatively evaluate the dense depth estimation using the RGBD Scenes V2 dataset [Henry et al. 2013] and the segmentation using keyframe annotations of the Object-Videos dataset.