BioTransfer / Riboseq ATLAS · Hidden Proteins in Cancer
GEO: GSE143263

Riboseq ATLAS

First public query tool for nuORFdb v1.0

Every gene you know can also produce unexpected, shorter proteins from hidden reading frames. This tool lets you search which of these are actively translated in cancer — discovered by ribosome profiling (Ribo-seq), which captures ribosomes in the act of building proteins.

323,848 ORFs  ·  5 cancer / healthy cohorts  ·  No other public query interface exists for this dataset  ·  Ouspenskaia et al., Nat Biotechnol 2022  ·  GEO: GSE143263

What is Riboseq ATLAS? What can I find here?

The core idea: Your genome encodes far more proteins than textbooks list. Beyond the ~20,000 known genes, cells — especially cancer cells — secretly translate thousands of additional short proteins from regions normally labelled "non-coding": upstream of genes, inside long non-coding RNAs, in alternative reading frames, and from pseudogenes.

How were they found? Ribosome profiling (Ribo-seq) freezes ribosomes mid-translation and sequences the short mRNA fragments they protect. This gives direct evidence that a stretch of RNA is being translated into protein, not just transcribed. The authors applied this to cancer cell lines and patient tumours, then built nuORFdb: a catalogue of 323,848 actively translated ORFs.

Why does it matter? These hidden proteins can appear on the surface of cancer cells as neoantigens — targets the immune system (or a therapy) could recognise. Some are cancer-specific: present in tumours but absent in healthy tissue. This tool is the only public interface to query, filter, and download this dataset interactively.

What you can do here
Search by gene

Find all translated ORFs associated with any gene (e.g. TP53, EGFR, BRCA1)

Filter by ORF type

Explore 5′ uORFs, lncRNA ORFs, out-of-frame, pseudogene-derived, and more

Check cancer expression

See RNA-seq expression in GBM, melanoma, CLL vs healthy tissues with boxplots

Verify translation

Ribo-seq TPM across 41 cell lines shows where the ORF is being actively translated
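
The gene search and ORF-type filter described above can be sketched on a small in-memory table. This is a minimal illustration only: the field names (orf_id, gene, orf_type), the helper search_orfs, and the example rows are hypothetical and do not reflect the actual nuORFdb export schema or its contents.

```python
# Hypothetical rows for illustration; a real nuORFdb export may use
# different column names and ORF identifiers.
ORFS = [
    {"orf_id": "ORF_0001", "gene": "TP53", "orf_type": "Canonical"},
    {"orf_id": "ORF_0002", "gene": "TP53", "orf_type": "5' uORF"},
    {"orf_id": "ORF_0003", "gene": "EGFR", "orf_type": "Out-of-Frame"},
    {"orf_id": "ORF_0004", "gene": "BRCA1", "orf_type": "3' dORF"},
]


def search_orfs(rows, gene=None, orf_types=None):
    """Return rows matching a gene symbol (case-insensitive) and/or a set of ORF types.

    Passing None for a criterion disables that filter.
    """
    wanted = set(orf_types) if orf_types is not None else None
    hits = []
    for row in rows:
        if gene is not None and row["gene"].upper() != gene.upper():
            continue
        if wanted is not None and row["orf_type"] not in wanted:
            continue
        hits.append(row)
    return hits


# All ORFs associated with TP53, then only its 5' uORFs.
tp53_hits = search_orfs(ORFS, gene="tp53")
tp53_uorfs = search_orfs(ORFS, gene="TP53", orf_types={"5' uORF"})
```

Combining both criteria in one helper mirrors how the search form narrows results: each filter only removes rows, so the order in which they are applied does not matter.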

Database

nuORFdb v1.0 — 323,848 ORFs

86,421 canonical (annotated)

237,427 novel unannotated (nuORF)

Source: ribosome profiling (Ribo-seq)

ORF Types

5′ uORF / 5′ Overlap uORF

3′ dORF / 3′ Overlap dORF

lncRNA-embedded

Out-of-Frame (alt reading frame)

Pseudogene-derived

Canonical (annotated CDS)
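
To make the positional ORF types above concrete, here is a simplified sketch that labels an ORF by where it sits relative to the annotated CDS on the same transcript. The rules, the function name classify_orf, and the 0-based half-open coordinate convention are assumptions made for illustration, not the annotation pipeline used to build nuORFdb. lncRNA-embedded and pseudogene-derived ORFs depend on the transcript biotype rather than on position, so they are outside this sketch.

```python
def classify_orf(orf_start, orf_end, cds_start, cds_end):
    """Label an ORF relative to the annotated CDS of the same transcript.

    Coordinates are transcript positions, 0-based and half-open
    (start inclusive, end exclusive). Simplified, assumed rules.
    """
    if (orf_start, orf_end) == (cds_start, cds_end):
        return "Canonical"
    if orf_end <= cds_start:
        return "5' uORF"            # entirely upstream of the CDS
    if orf_start < cds_start:
        return "5' Overlap uORF"    # starts upstream, runs into the CDS
    if orf_start >= cds_end:
        return "3' dORF"            # entirely downstream of the CDS
    if orf_end > cds_end:
        return "3' Overlap dORF"    # starts inside the CDS, runs past its end
    # From here on the ORF lies fully inside the CDS.
    if (orf_start - cds_start) % 3 != 0:
        return "Out-of-Frame"       # read in a different frame than the CDS
    return "Other"                  # in-frame internal ORF, not covered here


examples = {
    "upstream": classify_orf(10, 40, 100, 400),
    "overlap_start": classify_orf(50, 140, 100, 400),
    "alt_frame": classify_orf(152, 212, 100, 400),
    "downstream": classify_orf(450, 480, 100, 400),
}
```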

Ribo-seq Cell Lines (41)

Cancer lines:

A375 (melanoma)

HCT116 (colon carcinoma)

B721.221 (92 HLA mono-allelic)

+ primary melanoma / GBM / CLL tumours

Healthy:

Primary melanocytes (Hmel rep1–3)

What Ribo-seq measures

Active translation (ribosomes on mRNA), reported as TPM (transcripts per million). A high TPM means the ORF is actively translated in that cell line.
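
For readers unfamiliar with TPM, the sketch below applies the standard definition: read counts are divided by ORF length, then rescaled so the values in one sample sum to one million. How this tool counts ribosome footprints per ORF is not stated here, so the inputs and the function name tpm are assumptions for illustration.

```python
def tpm(counts, lengths_nt):
    """Convert per-ORF read counts to TPM for a single sample.

    counts      reads (here: ribosome footprints) assigned to each ORF
    lengths_nt  ORF lengths in nucleotides, same order as counts
    """
    if len(counts) != len(lengths_nt):
        raise ValueError("counts and lengths_nt must have the same length")
    # Length-normalised rate: reads per nucleotide of ORF.
    rates = [c / l for c, l in zip(counts, lengths_nt)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Rescale so that all ORFs in the sample sum to one million.
    return [r / total * 1_000_000 for r in rates]


# Made-up numbers: the short ORF with 30 reads and the long ORF with
# 120 reads have the same read density, so they get the same TPM.
example = tpm([30, 120, 0], [300, 1200, 90])
```

Because of the length normalisation, a short ORF with few reads can have the same TPM as a long ORF with many reads, which matters here since many nuORFs are much shorter than canonical coding sequences.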

RNA-seq Expression Groups (5)
CLL (Chronic Lymphocytic Leukaemia): 390 patient samples
Healthy B Cells: 21 healthy donor samples
GBM (Glioblastoma Multiforme, TCGA): 172 tumour samples
SKCM (Skin Cutaneous Melanoma, TCGA): 473 tumour samples
GTEx Healthy Tissues: 777 samples across 31 tissue types
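
One way to read the cohort comparison is to compare median expression between a tumour group and its healthy counterpart. The sketch below does that with a fold-change cut-off. The expression values are invented, and the function names, threshold, and pseudocount are arbitrary assumptions, not the criteria used by this tool or by the original study.

```python
from statistics import median


def cohort_medians(expression):
    """Median expression per cohort (the centre line of each boxplot)."""
    return {cohort: median(values) for cohort, values in expression.items()}


def tumour_enriched(expression, tumour, healthy, min_fold=4.0, pseudocount=1.0):
    """Return (flag, fold) comparing the tumour and healthy medians.

    The pseudocount avoids division by zero when the healthy median is 0.
    min_fold and pseudocount are illustrative defaults, not published cut-offs.
    """
    med = cohort_medians(expression)
    fold = (med[tumour] + pseudocount) / (med[healthy] + pseudocount)
    return fold >= min_fold, fold


# Invented expression values for a single ORF in two cohorts.
expression = {
    "CLL": [12.0, 15.0, 9.0, 20.0],
    "Healthy B Cells": [1.0, 0.5, 2.0],
}
flag, fold = tumour_enriched(expression, "CLL", "Healthy B Cells")
```

Medians are used rather than means so that a few outlier samples do not dominate the comparison, which is also why boxplots are a natural display for these cohorts.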
Citation

Ouspenskaia et al., Nat Biotechnol 2022.
GEO: GSE143263

Riboseq ATLAS — 323,848 actively translated ORFs (nuORFdb v1.0) including lncRNA-derived, 5′ uORF, Out-of-Frame, Pseudogene, and antisense sources. Click any ORF ID for full details, Ribo-seq TPM across 41 cell lines, and expression boxplots across 5 cancer & healthy cohorts.