Human RNA MaP

How to predict by coordinate

Enter the genomic coordinates of your region of interest, using hg38 as the reference genome. This input mode is the most flexible and is not constrained by sequencing coverage or annotation. Double-check that the output sequence matches your expected region, and check that the output coverage of A/C bases is above the recommended 70%. Coverage reflects the share of DMS-modifiable bases that pass all coverage and quality filters.
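As a rough illustration of the 70% check, the sketch below computes the percentage of A/C bases that have usable data. The function name and the per-base `has_valid_data` flags are hypothetical; the site's own filtering pipeline is not described here, so this only shows the arithmetic behind the reported percentage.

```python
RECOMMENDED_MIN_COVERAGE = 70.0  # percent, as recommended in the text above


def ac_coverage(sequence, has_valid_data):
    """Return the percent of A/C bases in `sequence` flagged as having valid data.

    `has_valid_data` is a hypothetical list of booleans, one per base, marking
    whether that base passed the coverage and quality filters.
    """
    seq = sequence.upper()
    if len(seq) != len(has_valid_data):
        raise ValueError("sequence and has_valid_data must have the same length")
    # Only A and C are counted, since DMS primarily modifies these bases.
    ac_positions = [i for i, base in enumerate(seq) if base in "AC"]
    if not ac_positions:
        return 0.0
    covered = sum(1 for i in ac_positions if has_valid_data[i])
    return 100.0 * covered / len(ac_positions)


# Toy example: 6 A/C bases, 5 of them with valid data.
seq = "ACGUACGUAC"
valid = [True, True, False, False, True, False, True, True, True, True]
coverage = ac_coverage(seq, valid)
meets_recommendation = coverage >= RECOMMENDED_MIN_COVERAGE
```

In the toy example, five of the six A/C bases are covered, so the region would clear the recommended threshold.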

For most users, only one set of coordinates will be used to define their region of interest. However, two sets of coordinates may be needed when joining two separate regions together, such as when crossing a splice junction.
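To make the two-coordinate case concrete, the sketch below parses a joined region written as `start-end / start-end` and adds up its length. The helper names are hypothetical, and the 1-based inclusive convention is an assumption, chosen because it reproduces the Length column of the window table on this page (for example, 51 nt + 48 nt = 99 nt for window 2).

```python
def parse_segments(text):
    """Parse 'start-end / start-end ...' into a list of (start, end) tuples."""
    segments = []
    for part in text.split("/"):
        start, end = (int(x) for x in part.strip().split("-"))
        if start > end:
            raise ValueError(f"start must not exceed end: {part.strip()}")
        segments.append((start, end))
    return segments


def joined_length(segments):
    """Total length in nt, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


# Window 2 of the SUGP2 table spans a splice junction.
window_2 = parse_segments("19026176-19026226 / 19030951-19030998")
window_2_length = joined_length(window_2)
```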

When analyzing a region within a larger transcript, it is generally recommended to test “buffer” regions by extending the region of interest by 20-50 nt on each end. This reduces the likelihood that structures are arbitrarily interrupted by the region borders.

Due to computational and server constraints, input is limited to a maximum length of 500 nucleotides and output to a maximum of 5 predicted structures, though some regions may yield fewer. If this does not suit your needs, consider downloading and running the data and code locally (see Download).
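The buffer recommendation and the 500-nucleotide limit interact, so a small helper can be handy when preparing coordinates. The sketch below is only an illustration: `add_buffer` is a hypothetical function, the coordinates are assumed to be 1-based and inclusive, and shrinking the buffer symmetrically to fit the limit is one possible policy rather than anything the site prescribes. It does not check the chromosome end.

```python
MAX_INPUT_NT = 500  # maximum input length stated above


def add_buffer(start, end, buffer_nt=30, max_len=MAX_INPUT_NT):
    """Extend [start, end] by up to `buffer_nt` on each side without exceeding `max_len`."""
    if start > end:
        raise ValueError("start must not exceed end")
    if buffer_nt < 0:
        raise ValueError("buffer_nt must be non-negative")
    length = end - start + 1
    if length > max_len:
        raise ValueError(f"region is {length} nt, above the {max_len} nt limit")
    # Largest symmetric pad that still fits within the length limit.
    room = (max_len - length) // 2
    pad = min(buffer_nt, room)
    # Coordinates cannot go below position 1.
    return max(1, start - pad), end + pad


# A 100-nt region with a 30-nt buffer on each side becomes 160 nt.
buffered = add_buffer(19025976, 19026075, buffer_nt=30)
# A 480-nt region only has room for 10 nt on each side.
capped = add_buffer(1000, 1479, buffer_nt=50)
```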

Predict by Gene

SUGP2
ENST00000337018 (-) ENSG00000064607 (-)

	Region	Local Coords	Chrom Coords	Length	R	N	Gini	Action
1	5'UTR,CDS	126-204	19031002-19031080	79	0.3216	50	0.29	View
2	CDS	208-306	19026176-19026226 / 19030951-19030998	99	0.5727	50	0.30	View
3	CDS	307-405	19026077-19026175	99	0.7973	50	0.41	View
4	CDS	407-506	19025976-19026075	100	0.6397	50	0.28	View
5	CDS	507-616	19025866-19025975	110	0.7304	50	0.38	View
6	CDS	619-757	19025725-19025863	139	0.5471	50	0.35	View
7	CDS	758-862	19025620-19025724	105	0.6738	50	0.31	View
8	CDS	863-940	19025542-19025619	78	0.7338	50	0.33	View
9	CDS	941-1016	19025466-19025541	76	0.5669	50	0.41	View
10	CDS	1017-1124	19025358-19025465	108	0.4895	50	0.28	View
11	CDS	1126-1210	19025272-19025356	85	0.3917	50	0.28	View
12	CDS	1211-1304	19025178-19025271	94	0.6019	50	0.36	View
13	CDS	1306-1404	19025078-19025176	99	0.3281	50	0.29	View
14	CDS	1405-1503	19024979-19025077	99	0.4567	50	0.28	View
15	CDS	1504-1592	19024890-19024978	89	0.7322	50	0.38	View
16	CDS	1597-1691	19024791-19024885	95	0.5606	50	0.38	View
17	CDS	1693-1785	19024697-19024789	93	0.0702	50	0.31	View
18	CDS	1786-1877	19019216-19019229 / 19024619-19024696	92	0.5838	50	0.38	View
19	CDS	1879-1967	19019126-19019214	89	0.4209	50	0.30	View
20	CDS	1969-2077	19010250-19010342 / 19019109-19019124	109	0.4081	50	0.26	View
21	CDS	2078-2166	19010161-19010249	89	0.5319	50	0.36	View
22	CDS	2167-2267	19010060-19010160	101	0.4481	50	0.40	View
23	CDS	2268-2363	19009964-19010059	96	0.4868	50	0.39	View
24	CDS	2364-2442	19009885-19009963	79	0.4227	50	0.34	View
25	CDS	2443-2540	19008361-19008428 / 19009855-19009884	98	0.2290	50	0.30	View
26	CDS	2542-2633	19004598-19004646 / 19008317-19008359	92	0.4142	50	0.36	View
27	CDS	2641-2735	19004496-19004590	95	0.9544	50	0.42	View
28	CDS	2736-2851	19004380-19004495	116	0.3952	50	0.37	View
29	CDS	2852-2948	19004283-19004379	97	0.6371	50	0.45	View
30	CDS	2953-3048	19004183-19004278	96	0.5530	50	0.33	View
31	CDS	3050-3131	18995275-18995280 / 19001613-19001674 / 19004168-19004181	82	0.5747	50	0.31	View
32	CDS	3132-3231	18995175-18995274	100	0.6732	50	0.38	View
33	CDS	3233-3338	18994411-18994486 / 18995144-18995173	106	0.8025	50	0.39	View
34	CDS,3'UTR	3339-3422	18991111-18991149 / 18994366-18994410	84	0.3986	50	0.31	View
35	3'UTR	3426-3606	18990927-18991107	181	0.4630	70	0.32	View

Custom Gene Windows

Gene Window Instructions/Interpretation

Search for a gene by name. For each gene, only the canonical transcript (as defined by UCSC in GRCh38.106) is shown. Some gene names may correspond to multiple genomic locations.
For the selected gene, a scatterplot shows the Gini indices for small windows within the transcript; a high Gini index corresponds to a highly structured region. Because DMS primarily modifies A/C bases, window sizes are counted in valid (coverage- and quality-filtered) A/C data points: 20 for small RNAs and 50 for all other transcripts. In regions with sufficient coverage, these windows correspond to actual transcript lengths of roughly 40 and 100 nucleotides, respectively.
Select a window of interest, either from the scatterplot or from the window table. Generally, windows with a high Gini index yield better results, because the DMS signal aligns more accurately with the predicted base pairs. Windows with high Gini indices relative to the rest of the transcript can also be used to find functional structural elements (e.g., TFRC iron response elements).
If the predefined windows do not suit your needs, you can define a custom window using either (1) the slider below the scatterplot, which shows a heatmap of the predefined windows along with the locations of each UTR and CDS; or (2) custom coordinate-based entry via Predict by Coordinates.
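For readers unfamiliar with the Gini index used in the steps above, the sketch below applies the standard Gini formula to a list of reactivity values and then to consecutive fixed-size windows. This is an assumption-laden illustration: the function names are hypothetical, the exact calculation used by the site is not given on this page, and tiling the data into non-overlapping windows (dropping any incomplete final window) is a simplification.

```python
def gini(values):
    """Standard Gini index of non-negative values (0 = perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        return 0.0
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    total = sum(xs)
    if total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with x sorted ascending.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def window_ginis(reactivities, size=50):
    """Gini index for each consecutive, non-overlapping window of `size` valid points."""
    return [
        gini(reactivities[i:i + size])
        for i in range(0, len(reactivities) - size + 1, size)
    ]


even = gini([1, 1, 1, 1])    # identical signal at every base
skewed = gini([0, 0, 0, 1])  # signal concentrated in a single base
```

An even signal gives a Gini index of 0, while signal concentrated in a few bases pushes it toward 1, which is why higher values are read as more structured.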

See About for more information on the dataset and for best practices in structure determination.

AOI	Chr	Coords	Strand
SUGP2	19	18990927 - 18991107	-

Annotation: Based on canonical annotations, the following gene is in your area of interest: SUGP2 (-)
Length: The region is 181 nt long.
Coverage: 100.0% of the region's A/C bases are included in the filtered DMS dataset. Ideally, this number should be as high as possible for an experimentally accurate prediction; over 70% is recommended.
Structures: This region has a maximum of 9 predicted structures. At most 5 structures are shown on this site. For up to 20 predictions per region, download and run the code locally.
