Ask My DNA Blog


You paid for a 23andMe kit, got your results, and now you have access to something far more powerful than the polished health reports on the dashboard: your raw genetic data. This file contains roughly 600,000–700,000 single nucleotide polymorphisms (SNPs) with your actual genotype calls. Most people never open it. That's a mistake.

This guide walks you through downloading your raw data, understanding what's inside, and using it to extract actionable insights, from pharmacogenomics to nutrition to long-term health risks. We'll also cover what 23andMe's raw data cannot tell you, and how third-party tools like Ask My DNA can help you interpret it through an AI-powered conversation.

Note on 23andMe's 2025 situation: Following 23andMe's bankruptcy filing in early 2025, downloading your raw data is more important than ever. Once a company undergoes acquisition or liquidation, your data may be transferred or deleted. Download it now, store it securely, and own your own genomic information.


What's Actually Inside Your 23andMe Raw Data File

When you download your raw data, you get a plain-text file (.txt) that looks deceptively simple. Here's what it contains:

  • Header lines (starting with #): metadata including the chip version, build reference (GRCh37/hg19), and date
  • Data rows, one per SNP, with four columns:
    • rsid: the reference SNP ID (e.g., rs429358)
    • chromosome: where the variant sits (1–22, X, Y, MT)
    • position: genomic coordinate
    • genotype: your two alleles (e.g., CT, AA, -- for no call)
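To make the layout concrete, here is a minimal Python sketch of a parser for this format. It assumes the four columns are tab-separated, and the names `SnpCall` and `parse_raw_data` are made up for this example. The sample rows are illustrative, not taken from a real file.

```python
from typing import Iterable, NamedTuple


class SnpCall(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    genotype: str


def parse_raw_data(lines: Iterable[str]) -> dict[str, SnpCall]:
    """Parse raw data lines into a dict keyed by rsid (assumes tab-separated columns)."""
    calls: dict[str, SnpCall] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#'-prefixed header/metadata lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # ignore rows that do not have the four expected columns
        rsid, chromosome, position, genotype = fields
        calls[rsid] = SnpCall(rsid, chromosome, int(position), genotype)
    return calls


# Tiny illustrative sample in the same layout (not real results).
sample = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs429358\t19\t45411941\tTT\n"
    "rs7412\t19\t45412079\tCC\n"
    "rs0000001\t1\t12345\t--\n"
)
calls = parse_raw_data(sample.splitlines())
print(calls["rs429358"].genotype)
```

For a real file you would pass an open file handle instead of the sample string, for example `parse_raw_data(open(path, encoding="utf-8"))`, where `path` points at your downloaded .txt file.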

The file is roughly 15–25 MB uncompressed. With ~650,000 SNPs, you have a substantial slice of your genome, but not all of it. 23andMe uses a genotyping array (currently the v5 chip, built on Illumina's Global Screening Array), which interrogates pre-selected positions rather than sequencing the entire genome. This matters: variants not on the array simply don't appear in your file.

Key SNPs you'll find in the file include clinically relevant positions like rs429358 and rs7412 (APOE), rs1801133 (MTHFR C677T), rs9939609 (FTO), and thousands of pharmacogenomic markers.


How to Download Your 23andMe Raw Data (Step by Step)

  1. Log in at 23andme.com
  2. Click your name (top right) → Settings
  3. Scroll to 23andMe Data → click View
  4. Select Download Raw Data
  5. Choose which profile (if you manage multiple kits)
  6. Complete identity verification (password + 2FA)
  7. Click Download

You'll receive a .zip archive. Inside is a single .txt file. Store it somewhere safe: an encrypted folder, a password manager's secure note storage, or an offline drive. Do not upload it to random websites without reading their privacy policy.
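If you would rather not leave an unencrypted copy of the .txt file on disk, one option is to read it straight out of the archive. The sketch below uses Python's standard `zipfile` module. The function name `read_raw_text` and the member file name are assumptions for this example, and the demo builds a small in-memory archive rather than touching a real download.

```python
import io
import zipfile


def read_raw_text(zip_source) -> list[str]:
    """Return the lines of the first .txt member of a zip archive, without extracting it."""
    with zipfile.ZipFile(zip_source) as archive:
        txt_names = [n for n in archive.namelist() if n.lower().endswith(".txt")]
        if not txt_names:
            raise ValueError("no .txt file found in archive")
        with archive.open(txt_names[0]) as member:
            return member.read().decode("utf-8").splitlines()


# Demo with an in-memory archive standing in for the downloaded .zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("genome_example.txt", "# header\nrs4988235\t2\t136608646\tAG\n")
buffer.seek(0)
lines = read_raw_text(buffer)
print(len(lines))
```

`read_raw_text` also accepts a file path in place of the in-memory buffer, since `zipfile.ZipFile` takes either.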


Pharmacogenomics: How Your Genes Affect Medication Response

This is one of the highest-value use cases for raw data analysis. Pharmacogenomics (PGx) studies how genetic variants change the way your body metabolizes, responds to, or is harmed by specific drugs.

Key genes covered (at least partially) in 23andMe raw data:

CYP2D6: metabolizes ~25% of all prescription drugs, including codeine, tamoxifen, many antidepressants (fluoxetine, paroxetine), and antipsychotics. A "poor metabolizer" genotype means standard doses of some drugs can accumulate to toxic levels, while prodrugs that rely on CYP2D6 for activation (such as codeine and tamoxifen) may be less effective.

CYP2C19: critical for clopidogrel (Plavix), omeprazole, escitalopram. The rs4244285 variant (CYP2C19*2) creates a non-functional enzyme. Poor metabolizers on clopidogrel have significantly higher cardiovascular event rates.

CYP2C9: warfarin dosing. Variants rs1799853 (*2) and rs1057910 (*3) reduce enzyme activity, meaning standard warfarin doses can cause bleeding.

VKORC1: also affects warfarin sensitivity. rs9923231 determines how sensitive your vitamin K cycle is to warfarin inhibition.

TPMT / NUDT15: thiopurine drugs (azathioprine, mercaptopurine). Variants here can cause life-threatening myelosuppression at standard doses.

Important caveat: 23andMe's array does not capture all PGx variants, particularly rare copy number variations in CYP2D6 that are critical for accurate phenotype assignment. For clinical decisions, a dedicated PGx panel from a lab like GeneSight or a medical genetics consultation is the appropriate next step. Use raw data analysis as a starting point, not a clinical report.
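Because array coverage is partial, a sensible first step is simply checking which of the markers named above are present in your file and actually called. The following Python sketch does only that. It assumes you already have a plain `rsid -> genotype` dictionary, `pgx_coverage` is a hypothetical helper name, and it deliberately does not attempt to assign metabolizer phenotypes.

```python
# rsIDs taken from the gene descriptions above; labels are for display only.
PGX_MARKERS = {
    "rs4244285": "CYP2C19*2",
    "rs1799853": "CYP2C9*2",
    "rs1057910": "CYP2C9*3",
    "rs9923231": "VKORC1",
}


def pgx_coverage(genotypes: dict[str, str]) -> dict[str, str]:
    """Report, for each marker, whether it is absent, a no-call, or genotyped."""
    report = {}
    for rsid, label in PGX_MARKERS.items():
        genotype = genotypes.get(rsid)
        if genotype is None:
            status = "not in file"
        elif "-" in genotype:
            status = "no call"
        else:
            status = f"genotype {genotype}"
        report[f"{label} ({rsid})"] = status
    return report


# Illustrative input, not real results.
example = {"rs4244285": "GG", "rs1799853": "CT", "rs1057910": "--"}
for marker, status in pgx_coverage(example).items():
    print(marker, "->", status)
```

A marker that is "not in file" or a "no call" tells you nothing about your status at that position, which is one reason a dedicated panel is the right tool for clinical questions.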


Nutrition and Metabolism Genetics

Your raw data contains several SNPs with practical dietary implications:

MTHFR (rs1801133, rs1801131): methylfolate conversion. The C677T variant (TT homozygote) reduces MTHFR enzyme activity by ~70%, impairing folate metabolism and homocysteine clearance. Relevant for folate form supplementation decisions.

FTO (rs9939609): associated with fat mass and obesity. The AA genotype carries ~1.67x increased obesity risk vs. TT. More relevant as a modifier of dietary fat and satiety response than a deterministic predictor.

LCT (rs4988235): lactase persistence. This single SNP is highly predictive: GG genotype = lactase non-persistence (lactose intolerance in most populations). AG/AA = lactase persistence.
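The LCT rule is simple enough to write down directly. Here is a small Python sketch of it, assuming the genotype string uses the same A/G alleles as the rule above. The function name `lactase_status` is invented for this example, and anything other than a clean two-letter A/G call is treated as unknown.

```python
def lactase_status(genotype: str) -> str:
    """Classify rs4988235 using the rule above: GG = non-persistence, AG/AA = persistence."""
    if len(genotype) != 2 or any(allele not in "AG" for allele in genotype):
        return "unknown"  # no-calls ("--") or unexpected alleles
    # Counting G alleles makes the result independent of allele order (AG vs. GA).
    return "non-persistence" if genotype.count("G") == 2 else "persistence"


for call in ("GG", "AG", "AA", "--"):
    print(call, "->", lactase_status(call))
```

The same pattern works for any single-SNP rule in this section, but most of the other associations here are much weaker predictors than LCT.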

FADS1/FADS2: fatty acid desaturase genes. Variants affect conversion efficiency of ALA to EPA/DHA, with direct implications for omega-3 supplementation needs.

APOA2 (rs5082): saturated fat interaction. The CC genotype (as named in the published studies; the raw file may report the opposite strand) is associated with greater BMI increase in response to high saturated fat intake.

VDR (rs2228570, rs1544410): vitamin D receptor variants affecting vitamin D binding efficiency and downstream gene expression.


Health Risk Variants Worth Knowing

These are the variants where knowing your status has clear preventive implications:

APOE (rs429358 + rs7412): the most clinically significant finding for most people. The combination of these two SNPs determines your APOE genotype (e2/e2 through e4/e4). APOE4 carriers have 3–15x increased Alzheimer's risk depending on copy number, and altered cardiovascular lipid metabolism. This is information 23andMe does report in their Health+Ancestry plan, but it's also directly readable from raw data.
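To show how two SNPs combine into one APOE genotype, here is a lookup-table sketch in Python. The allele-to-isoform mapping is the commonly cited convention (e2 = T at both positions, e3 = T at rs429358 and C at rs7412, e4 = C at both) and is an assumption of this example rather than something stated in the text above. `apoe_genotype` is a hypothetical name.

```python
# Keys are (rs429358 genotype, rs7412 genotype), each with alleles sorted alphabetically.
# Assumed convention: e2 = T/T haplotype, e3 = T/C, e4 = C/C (rs429358 allele / rs7412 allele).
APOE_TABLE = {
    ("TT", "TT"): "e2/e2",
    ("TT", "CT"): "e2/e3",
    ("TT", "CC"): "e3/e3",
    ("CT", "CT"): "e2/e4",  # unphased data; assumes the usual e2/e4 interpretation
    ("CT", "CC"): "e3/e4",
    ("CC", "CC"): "e4/e4",
}


def apoe_genotype(rs429358: str, rs7412: str) -> str:
    """Combine the two genotype calls into an APOE genotype, or 'undetermined'."""
    key = ("".join(sorted(rs429358)), "".join(sorted(rs7412)))
    return APOE_TABLE.get(key, "undetermined")


print(apoe_genotype("TT", "CC"))
print(apoe_genotype("TC", "CC"))
```

No-calls and any combination outside the table come back as "undetermined" instead of a guess, which is the safer default for a result this consequential.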

BRCA1/BRCA2: 23andMe's BRCA report covers only a small set of selected variants (originally three founder variants), not a comprehensive panel. A negative result does not rule out BRCA pathogenic variants. If you have family history of breast or ovarian cancer, a clinical BRCA panel through a genetic counselor covers hundreds of variants.

MUTYH (rs34612342, rs36053993): biallelic mutations cause MUTYH-associated polyposis, a colorectal cancer risk syndrome.

HFE (rs1800562, rs1799945): hereditary hemochromatosis. The C282Y homozygote (AA at rs1800562) is the primary high-penetrance genotype.


Trait Genetics: Athletic Performance and Other Curiosities

Not all raw data analysis has to be medically serious:

ACTN3 (rs1815739): the "speed vs. endurance" gene. The RR genotype (CC) preserves alpha-actinin-3 in fast-twitch fibers, associated with sprint/power performance. The XX genotype (TT), found in ~18% of Europeans, means no alpha-actinin-3 expression, associated with better endurance efficiency.

CYP1A2 (rs762551): caffeine metabolism speed. AA = fast metabolizer (caffeine cleared quickly, lower cardiovascular risk from coffee consumption). AC/CC = slow metabolizer (caffeine persists longer, associated with higher MI risk at high intake).

CLOCK (rs1801260): circadian preference. Associated with morningness-eveningness (chronotype).

SLC45A2, SLC24A5, OCA2/HERC2: pigmentation variants affecting eye color, hair color, and skin tone.


Tools for Analyzing Your 23andMe Raw Data

Once you have the file, you have several options:

Ask My DNA: upload your 23andMe raw data and interact with an AI that has direct access to your SNPs. Instead of reading static reports, you ask specific questions: "What do my CYP2D6 variants mean for antidepressant dosing?" or "Do I have the APOE4 variant?" The AI queries your actual genotype data and gives you a personalized, conversational answer. It covers pharmacogenomics, nutrition, fitness, ancestry-informative markers, and health risks, without requiring you to know rsIDs in advance.

Promethease: a long-established tool that cross-references your raw data against SNPedia, a wiki of genotype-phenotype associations. Generates a comprehensive report for a small fee (~$12). High information density, but requires patience to navigate and some background knowledge to interpret.

Genetic Genie: free tool focused specifically on methylation (MTHFR, COMT, MTR, MTRR, CBS pathways) and detox genes (CYP1B1, GSTP1, etc.). Useful for a narrow but clinically popular analysis focus.

SelfDecode: subscription-based platform with curated health reports from raw data. More consumer-friendly interface than Promethease.

What to avoid: Any service asking you to upload your raw data without a clear privacy policy, data retention terms, or encryption disclosure. Your genome is permanent: you cannot change it if it's misused.


Limitations of 23andMe Raw Data You Should Understand

Array coverage gaps: Genotyping arrays test pre-selected positions. Rare variants, structural variants (insertions, deletions, copy number variations), and positions not on the Illumina chip are invisible in your file.

Imputation vs. direct genotyping: Some published raw data files include imputed genotypes, that is, statistically inferred positions that weren't directly measured. These carry lower confidence and may not be flagged clearly.

No clinical validation: 23andMe's reports are FDA-authorized for specific conditions under the DTC (direct-to-consumer) framework, which has a different standard than clinical-grade laboratory testing. A positive finding for a serious condition should always be confirmed through a CLIA-certified lab.

No phasing information: The raw data tells you what alleles you have but not which copy of a chromosome (maternal or paternal) each allele sits on. This matters for compound heterozygosity (e.g., two CFTR mutations on different copies of the chromosome vs. the same copy), something the raw file alone cannot resolve.
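A short Python sketch makes the ambiguity visible: given two unphased genotypes, it enumerates every pair of haplotypes that could have produced them. The function name `possible_phasings` and the example alleles are purely illustrative.

```python
def possible_phasings(site1: str, site2: str) -> set[tuple[str, str]]:
    """Return all unordered haplotype pairs consistent with two unphased two-letter genotypes."""
    phasings = set()
    # Try both assignments of each site's alleles to the two chromosome copies.
    for a1, a2 in (site1, site1[::-1]):
        for b1, b2 in (site2, site2[::-1]):
            # Sorting makes the pair unordered: (copy 1, copy 2) == (copy 2, copy 1).
            phasings.add(tuple(sorted((a1 + b1, a2 + b2))))
    return phasings


# Two heterozygous sites: two arrangements are possible, and the file cannot say which.
print(possible_phasings("AG", "CT"))
# One homozygous site: only one arrangement, so no ambiguity.
print(possible_phasings("AA", "CT"))
```

When both sites are heterozygous the function returns two candidate arrangements, which correspond to the "same copy" and "different copies" cases described above. Resolving them takes family data, statistical phasing, or a sequencing method that reads both positions together.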

Chip version changes: Different 23andMe chip versions (v3, v4, v5) cover different SNP sets. Some older variants that researchers cite may be absent from newer kits.


Data Privacy and Security

Store your raw data file with the same care you'd give a passport scan:

  • Encryption at rest: Use VeraCrypt, macOS FileVault, or a hardware-encrypted drive
  • Upload only to trusted services: Read privacy policies for data retention, third-party sharing, and deletion rights
  • GDPR rights (EU residents): Any service operating with EU user data must honor deletion requests under Article 17
  • 23andMe bankruptcy (2025): Under US bankruptcy law, user data can be considered an asset in acquisition deals. Download your data now. 23andMe's privacy policy currently prohibits sale of genetic data to insurance or pharma without consent, but the operative word is "currently". Controlling your own copy removes that dependency.

Services like Ask My DNA process your file for query purposes and do not sell or share your raw genetic data with third parties.


Frequently Asked Questions

Can I use my 23andMe raw data for medical decisions? Raw data analysis is informational, not diagnostic. It can help you ask better questions of your doctor and identify variants worth discussing, but clinical decisions require confirmation through CLIA-certified testing and interpretation by a qualified healthcare provider. The Genetics in Medicine literature documents false-positive rates in DTC testing that make independent confirmation important for any actionable finding.

What's the difference between 23andMe's health reports and raw data analysis? 23andMe's in-app health reports cover a curated set of conditions with FDA authorization and consumer-friendly explanations. Your raw data contains all ~650,000 SNPs the chip measured, including thousands of variants 23andMe doesn't report on. Third-party tools access this full dataset, enabling analysis of pharmacogenomic variants, nutritional SNPs, and research-grade associations that 23andMe's platform doesn't surface.

Is it safe to upload my 23andMe file to Ask My DNA? Ask My DNA uses your file to answer your questions in the chat interface. Your raw data is not shared with third parties or used to train models without consent. The service uses encrypted transmission and storage. As with any genomic data service, review the privacy policy and data retention terms before uploading.

How accurate are third-party raw data analyses? Accuracy depends on what's being analyzed. Well-established, high-penetrance variants (LCT rs4988235, APOE rs429358/rs7412, HFE C282Y) have high genotyping accuracy and strong evidence bases. Polygenic risk scores and trait predictions are probabilistic and population-level: they describe statistical tendencies, not individual certainties. The further a claim is from a simple Mendelian variant, the more skepticism is warranted.

What if 23andMe deletes my data during bankruptcy proceedings? Download your raw data immediately from Account Settings → 23andMe Data → Download Raw Data. You own the right to your own genomic information. Once downloaded, your local copy is independent of what happens to 23andMe's servers or any future acquirer's data policies.




What to Do With 23andMe Raw Data: Complete Analysis Guide (2026)