Media Forensics
PAR Government’s leading-edge research in Media Forensics contributes to deployable solutions that identify altered photos, videos, and other imagery. High-quality, mission-specific training and validation data are critical to harnessing the power of machine learning and artificial intelligence. PAR is advancing both the art and the science of media forensics by creating one of the world’s largest curated, high-provenance data corpora – and deploying it to accelerate not only our understanding of the threat, but our ability to counter it.
We establish confidence in AI media analysis through evaluations designed with real-world threat models in mind.
- We are responsible for designing evaluations for bleeding-edge media forensic techniques, grounded in a real-world threat landscape.
- We have designed and executed over 70 evaluations using 65+ datasets, with 6 participating teams and over 200 submitted techniques.
We curate and create evaluation datasets tailored to custom scenarios to determine the effectiveness of forensic techniques.
- In DARPA’s MediFor and SemaFor Programs, we’ve produced petabytes of data used in evaluation of bleeding-edge forensic techniques.
- Our work spans high-provenance data collection; manual and automated manipulation of text, image, audio, and video; DeepFakes; and fully synthetic media produced by a wide range of generative models.
We leverage world-class expertise to deliver state-of-the-art forensic tools in accessible, flexible applications.
- PAR’s Media Forensics team has spent over 5 years creating purpose-built evaluation datasets and systems that establish trustworthy AI media analysis.
Evaluation Design
- We build evaluation tasks designed around real-world scenarios and in-the-wild problems.
- We design scoring strategies and choose metrics that ensure consistent performance across target dimensions.
- When target scenario data is not available, we design proxy datasets and tasks to leverage analogous data to solve sensitive problems.
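As a rough illustration of the scoring strategies described above (a hypothetical sketch, not PAR's actual evaluation harness), a detection evaluation might report both ROC AUC and the detection rate at a fixed false-alarm rate, so that performance stays comparable across target dimensions and datasets:

```python
import numpy as np

def detection_score(labels, scores, target_far=0.05):
    """Score a manipulation detector: ROC AUC plus detection rate at a
    fixed false-alarm rate. Illustrative only."""
    labels = np.asarray(labels)   # 1 = manipulated, 0 = pristine
    scores = np.asarray(scores)   # detector confidence, higher = more suspicious

    # Build the ROC curve by sweeping every observed score as a threshold.
    thresholds = np.sort(np.unique(scores))[::-1]
    tpr, fpr = [], []
    for t in thresholds:
        pred = scores >= t
        tpr.append((pred & (labels == 1)).sum() / max((labels == 1).sum(), 1))
        fpr.append((pred & (labels == 0)).sum() / max((labels == 0).sum(), 1))
    tpr, fpr = np.array(tpr), np.array(fpr)

    # Area under the ROC curve via the trapezoidal rule.
    fpr_full = np.concatenate(([0.0], fpr, [1.0]))
    tpr_full = np.concatenate(([0.0], tpr, [1.0]))
    auc = float(np.sum((fpr_full[1:] - fpr_full[:-1])
                       * (tpr_full[1:] + tpr_full[:-1]) / 2))

    # Best detection rate achievable without exceeding the target FAR.
    ok = fpr <= target_far
    tpr_at_far = float(tpr[ok].max()) if ok.any() else 0.0
    return auc, tpr_at_far
```

Reporting a point metric such as detection rate at 5% false alarms, alongside AUC, keeps a technique honest in the operating region that matters for real-world deployment.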
Data Production - Manipulation
Our data team can produce manipulated media to simulate threats of concern, fueling evaluation tasks that ensure that detection capabilities are up to the challenge.
- DeepFakes and Puppeteering
- Photoshop and Image splices
- Synthetic images
- Text - synthetic, human-generated, or a mix of the two.
- Audio - synthetically generated speech or voice clones
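At its simplest, producing a manipulation like an image splice for evaluation means pairing the altered media with a ground-truth mask that scoring can compare against. The sketch below is purely illustrative (production pipelines add blending, resampling, and recompression):

```python
import numpy as np

def splice(donor, host, top, left, h, w):
    """Paste an h-by-w patch from `donor` into `host` at (top, left).
    Returns the manipulated image and a ground-truth localization mask.
    Toy example: real splice generation also blends and recompresses."""
    out = host.copy()
    out[top:top + h, left:left + w] = donor[top:top + h, left:left + w]
    mask = np.zeros(host.shape[:2], dtype=np.uint8)
    mask[top:top + h, left:left + w] = 255   # 255 marks manipulated pixels
    return out, mask
```

The mask is what makes the data usable for evaluation: localization metrics score a detector's predicted manipulation region against it, not just the binary real/fake call.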
AI Training Data
Optimize your convolutional neural network (CNN) models with mission-representative data
Camera Forensics
Identify specific device used to collect imagery
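One widely published approach to source-camera identification (illustrative only, not necessarily the technique deployed here) is PRNU fingerprinting: each sensor imprints a faint, stable noise pattern, and correlating an image's noise residual against per-camera fingerprints points to the source device. A toy sketch, with a crude box blur standing in for a real wavelet denoiser:

```python
import numpy as np

def noise_residual(img):
    """Noise residual = image minus a denoised version.
    A 3x3 box blur stands in for a proper wavelet denoiser."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    denoised = sum(padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
                   for dy in range(3) for dx in range(3)) / 9.0
    return img - denoised

def correlate(a, b):
    """Normalized cross-correlation between two residuals."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def identify(image, fingerprints):
    """Return the camera ID whose PRNU fingerprint best matches the image."""
    res = noise_residual(image)
    return max(fingerprints, key=lambda cam: correlate(res, fingerprints[cam]))
```

In practice the fingerprint for each camera is estimated by averaging residuals over many known images from that device, and matching uses statistics such as peak-to-correlation energy rather than raw correlation.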