Research from the early molecular genetics era that supported protein as the primary carrier of genetic information?

I can't seem to find anything on my own. Surely there were experiments performed (possibly using bacteriophages) that managed to come to this conclusion?

Specifically I'm interested in anything pre-Hershey/Chase.


It's difficult to be convincing with a negative answer, but I venture that no experiments supported the idea that protein was the genetic material. It was a supposition based on the conventional wisdom that proteins were more complex and therefore better suited to the job.

I suggest that you should be thinking earlier than bacteriophage and Hershey and Chase. Bacteriophages only came into genetic research after the war, and the Hershey/Chase experiment (1952) was only a rigorous confirmation of the conclusion of Avery et al. in 1944 (mentioned in this Scitable article). The system that Avery et al. used (bacterial transformation) went back to Griffith in 1928 and must have been the first that allowed any sort of experimentation in this area. And I think it was more a question of working out what was happening than devising an experiment to distinguish between protein and DNA as the genetic material.

To my mind this illustrates something about how science progresses. It is often dependent on advances in technology and changes in the conceptual framework in which scientists work. (Sometimes - e.g. Mitchell's Hypothesis - it needs the experimental results to change that conceptual framework.) Coming back to the Hershey/Chase experiment, this was performed when the balance of scientific thought had already changed to DNA as the genetic material.

Footnote: It's earlier than you think

I generated the Google ngram below to put the term you use, 'molecular genetics' into historical perspective.

This shows that the term 'molecular genetics' was hardly in use at the time of the Hershey-Chase experiment and Watson and Crick's postulate of the structure of DNA in the early 1950s. I think that this reflects the fact that the concept was not current. The more popular term, 'molecular biology' gained currency on the basis of the structural work on DNA and proteins, and the name of the journal in which much of the work was published (Journal of Molecular Biology). The question is assuming a state of science that did not actually exist before the 1950s.


I am relying heavily on Morange's book (quoted below) for what follows.

As pointed out by David, the definitive experiment showing that nucleic acid, rather than protein, is the carrier of genetic information is that of Avery, MacLeod and McCarty (1944). Their conclusion that DNA is the 'transforming principle' was, however, very reluctantly accepted by the scientific community, in particular by the biochemists (see Cobb, 2014), and it was not until the famous 'Waring blender' experiment of Hershey and Chase (1952), and beyond, that it was accepted that DNA is the hereditary material.

So, to attempt to answer your question, what was the evidence for proteins (in particular enzymes) being the hereditary material?

  1. It was known that proteins are one of the two components of chromosomes (Morange, p 35)

  2. The tetranucleotide hypothesis of Levene, published about 1910, held that DNA was made of a monotonous, repeating tetranucleotide unit, and this gave rise to the idea that DNA had only a non-specific structural role in the chromosome (Morange, p 34).

  3. By the 1930s, several enzymes (including urease) had been crystallized by Northrop and Sumner and it was unequivocally established that enzymes are proteins and are macromolecular, thus ushering in the great era of enzymology. Compared to the dynamic world of enzymology, DNA was considered boring.

  4. In 1935, Wendell M. Stanley had isolated TMV as "a crystalline protein possessing the properties of Tobacco Mosaic Virus" and had published the results in Science. Crystallization (wrongly) implied purity, and it is probably understandable that such a high-profile result lent weight to the hypothesis that genes and proteins are intimately connected. This result was criticized, however. Bawden & Pirie (1938) in the UK repeated the experiment but (unlike Stanley) found 6% RNA in the final product (Morange, p 65).

  5. Beadle and Tatum (1941) had published the 'one gene one enzyme' hypothesis and (to again quote Morange) [p 35] this "had reinforced the more or less conscious identification of genes with enzymes and proteins". The idea that enzymes must be important in heredity was further reinforced by the work of Garrod on inborn errors of metabolism, which also associated enzymes (specifically, the lack of them) with hereditary factors (Fruton, p 431).

  6. Avery's work was sharply criticized in some quarters, most notably by Alfred E. Mirsky, who pointed out that trace contamination by proteins could not be ruled out as an explanation. A great account of the controversy with Mirsky and of the reasons why Avery's work was only slowly accepted is given by Cobb, 2014 [pdf]. One interesting stat from this paper: Hershey and Chase did not quote any of Avery's papers and Hershey is quoted as saying that “I wasn't too impressed by the results myself” (Cobb, 2014). This despite the fact that the DNA preparation of Hershey and Chase contained maybe 20% protein, but that of Avery and colleagues contained less than 0.2% (Cobb, 2014). A key paper in the controversy (by Mirsky) may be found here [pdf]

J.B.S. Haldane stated (in 1942) that "the size of a gene is roughly that of a protein molecule and it is very probable that genes are proteins" (quoted by Fruton, p 430).

So the biochemists, it would seem, felt that proteins were the 'holy grail' of heredity, and this attitude prevailed in some quarters long after Watson and Crick. Morange (p 232) points out that Arthur Kornberg discovered DNA polymerase in 1958 and was (jointly) awarded the Nobel prize in 1959. Watson and Crick, whose seminal paper on DNA was published in 1953, were not so honored until 1962. That is an interesting, and perhaps very revealing, point.

And speaking of Nobel prizes, Avery did not get one (he died in 1955), perhaps in part due to the controversy with Mirsky (see Cobb). Erwin Chargaff - he of the A/T and G/C ratios - (who accepted Avery's work as of fundamental importance at a very early stage) is quoted as saying that Avery's work deserved two Nobel prizes. And, of course, Chargaff himself did not get one, which he surely deserved. As well as the famous ratios, Chargaff was one of the first to suggest that "Differences in the proportions or the sequence of the several nucleotides forming the nucleic acid chain also could be responsible for specific effects" (see Cobb).

Fruton (pp 416-417) quotes Francis Crick as saying about the reception of the double helix: "The reaction of many biochemists, including Joseph Fruton, ranged from coolness to muted hostility. They had long considered the biology of the gene to be based on proteins, not nucleic acids, and thought the problem far too difficult to tackle in the immediate future. It did not help that the structure had been put forward by two people who were not obviously card-carrying biochemists". Fruton (p 417) rejects this view and points out that in his classic textbook (with Sofia Simmonds) the Watson and Crick structure for DNA was readily accepted, and concludes (p 417) that "I can only surmise that Francis, who showed me much kindness during my stay in Cambridge during 1962 - 1963, may have accepted too readily some idle gossip".

All this and more may be found not only in Fruton and Morange, but also in Judson's great book The Eighth Day of Creation (which I don't have to hand at the moment).

Edit

After replying to David's comment, I dug out some further quotes by Sydney Brenner from Biochemistry Strikes Back which are also (somewhat) relevant.

"In years gone by, biochemists viewed most of the rest of biology as a descriptive science and from the even loftier standpoint of physics, Rutherford dismissed it as stamp collecting.

The early days of molecular biology were marked by what seemed to many to be an arrogant cleavage of the new science from biochemistry. People like myself, whose application for admission to the Cambridge Department of Biochemistry was ignored and who did not even receive the attention of a rejection letter, often expressed exaggerated views of our relationship to biochemistry. However, our argument was not concerned with the methods of biochemistry but only with their blindness in ignoring the new field of the chemistry of information". (Brenner, 2000)

(He goes on to say: "I once made the remark that two things disappeared in 1990: one was communism and the other biochemistry and that only one should be allowed back. Of course biochemistry never really went away but continued to flourish in the thousands of unread pages of biochemical journals") (Brenner, 2000).

Some great references:

Cobb, M. (2014) Oswald Avery, DNA, and the transformation of biology. Current Biology, Volume 24, pp R55-R60.

Fruton, J. S. (1999) Proteins, Enzymes, Genes. The Interplay of Chemistry and Biology. Yale University Press.

Judson, H. F. (1996) The Eighth Day of Creation. Makers of the Revolution in Biology (25th anniversary edn). Cold Spring Harbor Press.

Morange, M. (1998) A History of Molecular Biology (translated by Matthew Cobb). Harvard University Press.


Primary ciliary dyskinesia

Primary ciliary dyskinesia (PCD) is an inherited disorder that affects the movement of tiny hair-like structures on body cells, known as cilia. Cilia are present on many types of cells, and particularly on those in the respiratory tract. In PCD, the cilia are abnormal and don’t move correctly. People with this disorder cannot clear the mucus and fluid in their lungs and airways. This leads to frequent respiratory infections, and continuous nasal congestion and coughing. In addition, because cilia are involved in how the organs form and develop, many people with PCD may have abnormal placement of the organs in the body, known as situs abnormalities. [1] [2] [3] For example, their heart may be on the right side of their chest instead of the left. Almost all males with PCD are infertile.

PCD is caused by mutations in one of over 30 different genes involved in the formation of cilia, and is usually inherited in an autosomal recessive pattern in families. It is diagnosed based on the clinical symptoms. Other diagnostic tests may include ciliary analysis and genetic testing. Treatment is based on taking care of the symptoms. The long-term outlook for people with PCD depends on the severity of the symptoms. People with frequent lung infections may experience permanent lung damage and require a lung transplant. Early diagnosis and treatment may improve the long-term outlook for people with PCD. [4] [5]
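As a rough illustration of what an autosomal recessive pattern implies, the sketch below enumerates the possible allele combinations for the children of two parents. It assumes the textbook single-gene case in which both parents are unaffected carriers; the allele labels "A" (working copy) and "a" (disease-causing copy) and the function name are hypothetical, chosen only for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Enumerate a one-gene Punnett square.

    Each parent is a two-character genotype such as "Aa". The labels are
    hypothetical: "A" is a working allele, "a" a disease-causing allele.
    """
    # Every child receives one allele from each parent; sorting the pair
    # makes "aA" and "Aa" count as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two carrier parents (assumed scenario, not taken from the text above).
for genotype, probability in offspring_genotypes("Aa", "Aa").items():
    print(genotype, probability)
```

Under these assumptions each child has a 1 in 4 chance of inheriting two disease-causing copies (affected), a 1 in 2 chance of being an unaffected carrier, and a 1 in 4 chance of inheriting neither copy.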


Sources of error in molecular diagnostic analyses

Introduction

Molecular diagnostics has undergone a period of rapid development and growth in the last decade. The implementation of new high-complexity tests and the integration of new technologies into the clinical molecular diagnostics laboratory have been critical steps toward the goal of precision medicine. Molecular diagnostics encompasses diverse fields such as infectious disease, genetics, pharmacogenomics, and oncology; however, the underlying principles and potential sources of error are common among these different applications. Common errors that may occur with diagnostic molecular assays are discussed here. As with other sections of the clinical laboratory, a strong program of quality control and quality assurance is necessary to detect problems, to monitor errors, and to implement methods that assure quality.


Application of Genetic Testing in Hypertrophic Cardiomyopathy for Preclinical Disease Detection

From the Agnes Ginges Centre for Molecular Cardiology, Centenary Institute, Sydney NSW, Australia (J.I., C.B., C.S.); Central Clinical School, Sydney Medical School, University of Sydney, Sydney NSW, Australia (J.I., C.B., C.S.); School of Population Health, Sydney Medical School, University of Sydney, Sydney NSW, Australia (A.B.); and Department of Cardiology, Royal Prince Alfred Hospital, Sydney NSW, Australia (J.I., C.B., C.S.).

Hypertrophic cardiomyopathy (HCM) is the most common inherited cardiovascular disease, with a prevalence of at least 1 in 500 in the general population. 1,2 HCM is characterized by left ventricular hypertrophy, in the absence of other loading conditions, such as hypertension. 3 The hallmark feature of HCM is significant clinical heterogeneity in presentation, ranging from asymptomatic patients to those who have the most serious outcomes of heart failure and sudden cardiac death.

Over 1500 mutations in at least 15 sarcomere-encoding genes have been identified. 4–7 The significance of cardiac genetic testing in clinical practice is 2-fold. For the proband, identification of the underlying genetic cause in some cases can clarify the cause of hypertrophy, for example, clarifying phenocopies, such as PRKAG2-glycogen storage disease and Fabry disease. The greatest utility, however, is in cascade genetic testing of asymptomatic relatives, with clear benefits either for confirming a borderline clinical diagnosis, or suspicious clinical changes suggestive of early disease, or most importantly ruling out the disease in those who test gene-negative. Identification of a silent gene carrier will guide cascade testing of additional family members, in effect clarifying their risk status. Of most benefit, a negative genetic result can reassure offspring that they are not at risk of HCM.

The escalation in our understanding of the genetic basis of HCM has been catalyzed by the implementation of next generation sequencing technologies. In response to faster and more affordable testing, commercial genetic testing for HCM now often comprises vast cardiac gene chips (ie, 50–200 or more genes). This approach, although comprehensive, also draws into sharp focus the limitations of our current knowledge. The challenges of cardiac genetic testing are increasingly documented, such as identification of variants of uncertain significance (VUS), incidental genetic findings, 8 reclassification of variants, 9 increased need for pretest genetic counseling, 10 and appropriate initiation of treatments in gene carriers.

Despite the challenges, the benefit of knowing with some certainty that a family member does not carry the family mutation, and can be released from clinical surveillance and worry, should not be underestimated. What, on the other hand, does it mean for the asymptomatic relative who is told they are a gene carrier? Are we presented with a unique opportunity to detect early preclinical disease, allowing therapeutic intervention to prevent the worst outcomes? Or are we inflicting unnecessary harm and worry on an individual who would never have developed clinically significant disease? In our eagerness to prevent and treat disease early, are we overlooking the increasing burden on healthcare resources and costs? The impact of cascade genetic testing measured by benefits and harms remains largely unknown (Figure 1), but should be a key component in guiding the most effective and appropriate use of this technology.

Figure 1. Weighing up the advantages and disadvantages of preclinical disease detection for hypertrophic cardiomyopathy. HR-QoL indicates health-related quality of life.

The overuse of diagnostic tests and subsequent overdiagnosis is increasingly recognized as a major side effect of technological advances. This spans all medical disciplines and is frequently seen in the setting of new emerging technologies, such as new imaging modalities for cancer detection, or the inappropriate use of population screening tests, such as prostate-specific antigen testing in early detection of prostate cancer. 11 In 2011, an estimated $158 billion to $226 billion was spent on unnecessary low-value healthcare in the United States. 12 Initiatives such as Choosing Wisely challenge clinicians and patients to question whether the test or procedure ordered is supported by evidence, free from harm, and truly necessary. 13

This review aims to provide a basis for the importance of assessing benefits and harms of new technologies and to outline the evidence for the application of HCM genetic testing as a predictive tool for asymptomatic family members. There is no doubt genetic testing when used appropriately provides a powerful tool for diagnosis and management. Adequate consideration of potential harms, however, will ensure we continue to use this technology in an effective and sustainable way. We suggest practical measures that may minimize potential harms and allow the positive aspects of a genetic diagnosis to be realized.

New Genetic Testing Technologies and Complexity of Genetic Data

Throughout the history of medicine, there have been key discoveries that have revolutionized the care of patients. This includes the application of new technologies offering earlier and more accurate disease detection, from the discovery of the diagnostic ultrasound to cardiac magnetic resonance imaging, allowing visualization of cardiac structures in far greater detail than one could have previously imagined. The ability to more accurately diagnose structural heart disease means appropriate management strategies can be initiated with overall improved treatment of the patient. Although this is the ideal scenario, not all new technologies afford the same overall benefits, and indeed, some introduce unnecessary harms. Assessment of new technologies before their widespread integration into clinical practice is essential for a sustainable health system, with a view to defining how these might be used judiciously, taking advantage of the benefits while minimizing potential harms.

Harms can arise where a diagnostic test is performed without evidence of benefit (overuse), leading to healthy people being labeled as at risk or diseased (overdiagnosis), and more being treated unnecessarily (overtreatment). The net effect is considerable healthcare expenditure for little or no health benefit. In a climate of tightening healthcare spending, these resources could be allocated to more worthwhile purposes. There are also additional costs to consider, including both psychosocial and physical harms to the individual. In some cases, incidental findings from testing may also arise, that is findings not associated with the specific disease but often requiring further investigation. Although there is potential for harms as a result of many new diagnostic tests, awareness and strict guidelines around appropriate use are critical.

In HCM, next-generation sequencing technologies have transformed our understanding of the genetic basis of disease and the options available to our patients. The basis of the technology, massively parallel sequencing, has significantly reduced the time and costs required to perform genetic tests. 14 Although the application of these technologies to clinical-based cardiac genetic testing panels has progressed rapidly, our ability to interpret the variant data generated is lagging because of a basic lack of understanding of the biological mechanisms underpinning the disease genes and genetic variants identified. Over 80 million genetic variants have already been reported in the human genome (www.1000genomes.org). In one healthy person, over 200 rare potentially disease-causing variants and between 12 and 20 reportable variants might be identified. 15,16 Interpreting variant information for clinical use is, therefore, problematic. Among the core HCM genes, there is a low tolerance for genetic variation, 17 and thus, the number of variants identified per sample is generally small; however, this is not necessarily true for many other cardiac expressed genes included in large panels. The probabilistic nature of variant interpretation is well described, 10,18 where genetic results are considered along a continuum from benign to VUS, likely pathogenic, and pathogenic rather than as a binary (yes/no) outcome (Table). Conveying this result to the family can be a challenge, and pretest genetic counseling is vital to ensure the consequences and potential outcomes of genetic tests are understood before testing is ordered. 19,20

Table. The Range of Genetic Testing Outcomes for the Proband and Family

HCM indicates hypertrophic cardiomyopathy and VUS: variant of uncertain significance. Adapted from Ingles et al. 10

Most recently, the American College of Medical Genetics and Genomics has devised guidelines for determining pathogenicity of sequence variants in Mendelian disease. Evidence for causation is based around many factors, including rarity in population databases, identification in multiple unrelated patients with matching phenotypes, cosegregation with clinical disease, agreement between in silico models and conservation scores of an interruption to the protein function, and others. 21 A move to more stringent variant classification has highlighted the need for collaborative efforts to centralize variant data and maximize the prospect of assigning causation, information that is clinically relevant to the family. The National Institutes of Health–funded Clinical Genome Resource (ClinGen) 22 provides a platform to achieve this on a global scale, with a key goal to share data and knowledge about genetic variants via the publicly available ClinVar site (www.clinvar.com).

Early Genetic Diagnosis and Potential for Disease Prevention

Case 1: A healthy 10-year-old female suffers a cardiac arrest while running to catch a train. She is successfully resuscitated and on cardiac investigation is found to have HCM with asymmetric left ventricular hypertrophy of 22 mm. Her family is informed that they have a 50% inheritance risk and undergo clinical surveillance. Her father is found to be affected, whereas her mother and brothers show no evidence of disease. Genetic testing is performed, and a gene variant MYBPC3 NM_000256.3:c.1484G>A, p.Arg495Gln is identified. This variant has been reported previously in several unrelated HCM patients 23,24 and is seen at a low frequency in the Exome Aggregation Consortium database (ExAC, http://exac.broadinstitute.org) and in silico prediction models/conservation scores are in agreement and support a deleterious role. Furthermore, sequence changes of Arg495Gly and Arg495Trp have also been reported in HCM patients. Based on this evidence, the variant was classified as likely pathogenic, and cascade genetic testing was offered to family members. After careful pretest genetic counseling, her 16-year-old brother, a competitive basketball player, is identified as a gene carrier despite normal clinical investigations.

Disease prevention remains a cornerstone of medicine and the intense focus of public health efforts. Early detection of disease, particularly in the preclinical asymptomatic phase, affords the opportunity to intervene with prevention strategies or treatments to reduce overall disease morbidity and mortality. In 1968, Wilson and Jungner wrote, “The central idea of early disease detection and treatment is essentially simple. However, the path to its successful achievement (on the one hand, bringing to treatment those with previously undetected disease, and, on the other, avoiding harm to those persons not in need of treatment) is far from simple though sometimes may appear deceptively easy”. 25 Intuitively, early detection of disease is likely to improve health outcomes. For example, early detection of high cholesterol and subsequent lifestyle modification and pharmacological therapy has been shown to reduce both cardiovascular morbidity and mortality. However, several recent examples, such as prostate-specific antigen screening in prostate cancer, have challenged the generalizability of such approaches. Without evidence of benefit, there is growing recognition that we may be creating new healthy patient populations, that is, the worried well or patients in waiting.

Preclinical detection of HCM is routinely performed worldwide as a result of cascade genetic testing of asymptomatic at-risk family members. In a disease with marked clinical heterogeneity that often develops during or after adolescence, the clinical significance of a positive gene result is entirely dependent on the age of the family member. For those identified as children, there is a greater likelihood of the clinical phenotype developing, though currently no therapies to prevent or alter the development of left ventricular hypertrophy exist in clinical practice. For those tested beyond this age, that is, adults, the approach to managing these people has evolved during the last decade, moving from a conservative view that they are part of the spectrum of HCM and should be treated and restricted from high-level exercise to a more contemporary strategy of watchful waiting.

Little is known about the natural history and outcomes of silent gene carriers, and this has created several clinical dilemmas, including frequency of clinical surveillance and whether individuals should be restricted from sports. 26 Small retrospective studies suggest silent gene carriers have a benign clinical course, associated with a low risk of sudden cardiac death, and a low probability of developing manifest disease once reaching adulthood. 26–28 The largest patient series to date examined 339 silent gene carriers. 28 Of those who received follow-up clinical investigations during the study period, 29 of 162 (18%) had developed a phenotype. Overall, the low incidence of serious cardiac events in those without manifest HCM provides a rationale for less frequent clinical evaluation. 28 Clinically, the key recommendation for this group is lifelong clinical surveillance, though there is disagreement whether they should participate in high-level competitive sports, and therefore, the recommendation is to consider this on a case-by-case basis. 3,29,30

The psychological impact of being at risk of HCM has been studied; 31–33 however, the impact in silent gene carriers is less well understood. Anecdotally, the clinic experience is that provided there is good support and pretest genetic counseling, most patients adjust well to a positive carrier status. In a series of 89 silent gene carriers, self-reported health-related quality of life and psychological well-being (anxiety and depression) were either similar or significantly better than general population normative data. 34 Indeed, the strongest indicators of poor physical quality of life were development of manifest disease and perceived negative consequences of gene carriership. The motivation for pursuing cascade genetic testing among family members has also been investigated. Christiaans et al surveyed mutation carriers regarding their experience of undergoing cascade genetic testing, with 90% reporting “because it is hereditary and I wanted to know”; 87%, “because I wanted to know for myself”; and 67%, “because of my children” as reasons for undergoing testing. 35 Of the participants, 95% were satisfied with the process of genetic testing in the multidisciplinary cardiac genetic setting, and only 4% indicated that they regretted finding out they are a mutation carrier.

In a disease with no cure that leads to seemingly irreversible hypertrophy, fibrosis, and disarray of cardiomyocytes in the left ventricle, the long-term goal of HCM research has always been disease prevention. In this setting, effective identification of healthy individuals who are likely to convert to an HCM phenotype is highly sought after, with the potential to provide a larger therapeutic window to initiate early prevention strategies. These individuals represent a real opportunity to develop a cure. Studies in animal models of HCM have shown that early treatment with pharmacological agents, such as diltiazem 36 and losartan, 37 may prevent the onset of disease or delay progression. Most recently, preliminary studies in silent gene carriers with HCM have shown that preclinical therapy may improve early left ventricular remodeling in HCM, 37 with several other randomized control studies currently in progress (such as the HALT [Hypertrophic Regression with N-Acetylcysteine in HCM], LIBERTY [Effect of Eleclazine (GS-6615) on Exercise Capacity in Subjects With Symptomatic HCM], and VANISH [Valsartan for Attenuating Disease Evolution in Early Sarcomeric HCM] studies; http://clinicaltrials.gov). The INHERIT (Inhibition of the Renin Angiotensin System with Losartan in Patients with HCM) study, a recent randomized trial of losartan, an angiotensin receptor blocker which has previously shown promising results in transgenic mice, 38 showed no regression of disease; however, it highlighted some important points. 39 First, the feasibility of large multicenter trials, once considered impossible for a relatively rare disease, has been successfully demonstrated. Second, the preclinical disease stage is further highlighted as the critical window for effective intervention, and indeed, the VANISH trial is focused on this time point. Rather than using regression of disease as a primary end point, this trial aims to prevent disease onset, which if achieved will be an exciting step forward.

Benefit of Eliminating Risk and Uncertainty

Case 2: A 42-year-old male who was diagnosed with HCM as a teenager collapsed while exercising and could not be resuscitated. He was a fit man and did not regularly see his cardiologist as he never felt unwell. Because of limited access to genetic testing in 2002, the decedent’s 44-year-old brother attended for ongoing periodic clinical screening. These investigations have not shown evidence of disease, but this does not alleviate his anxiety regarding the risk to his children. Research genetic testing recently identified a known pathogenic variant in MYBPC3 NM_000256.3:c.772G>A, p.Glu258Lys. This variant has been reported in numerous unrelated HCM probands, has been shown to cosegregate with disease in multiple families, and impacts the splice consensus sequence at the end of exon 6, leading to a premature truncation of the protein. 40,41 The brother underwent comprehensive pretest genetic counseling and elected to pursue cascade genetic testing, where he was found to be gene-negative. He had overwhelming relief to know with certainty that he and his children would not suffer the same fate as his brother. Further, no ongoing costs and use of resources would be incurred pursuing lifetime clinical screening.

The most overwhelming and clinically relevant benefit of a genetic result is for the family 4 and, more specifically, in identifying family members who do not carry the mutation following cascade genetic testing. Genetic testing, as opposed to many other technologies, offers the unique opportunity to truly rule out disease risk because of a specific variant. This can be helpful in managing the family, with the ability to cease ongoing clinical investigations and to alleviate uncertainty and worry. Not surprisingly, health economic analyses to determine the incremental effectiveness of HCM genetic testing over clinical screening alone found gene-negative individuals to be one of the key reasons for the overall favorable cost-effectiveness. 42,43 Releasing a family member from clinical screening requires considerable confidence in the pathogenicity of the variant identified in the proband and highlights the critical importance of robust variant classification methods. Where a variant is uncertain, it should not be used for cascade genetic testing of family members (Table). Importantly, the patient should be reminded that their risk of developing disease is not zero and that a low risk remains (ie, same as the general population) because we cannot exclude the presence of untested modifier variants.

Reclassification of Variants and Impact on Clinical Relevance

Case 3: A 52-year-old man with a longstanding diagnosis of HCM and known family history of disease underwent genetic testing. Two variants were identified, a known pathogenic frameshift in MYBPC3 NM_000256.3: c.2864_2865delCT, p.Pro955Argfs, reported in multiple unrelated HCM probands and causing a premature truncation of the protein, 44 and a deletion in TCAP NM_003673.3: c.37_39delGAG, p.Glu13del, reported at the time as likely pathogenic. The proband’s sister, aged 40 years and with no clinical evidence of disease, underwent pretest genetic counseling and decided to pursue cascade genetic testing. Given there were 2 variants, there was a 3 in 4 chance she would inherit at least 1 of the variants. 45 She was found to carry only the TCAP deletion, that is, a deletion of a single amino acid. Although it has been previously reported in unrelated HCM probands, it is also seen at a frequency of 2% in the National Heart, Lung, and Blood Institute Exome Sequencing Project and other reported control populations, suggesting a likely benign role. 46 The reclassification of this variant meant the proband’s sister’s carrier status was incorrect and that she and her children were in fact not at risk of disease.
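The 3 in 4 figure quoted in Case 3 can be checked by enumerating the possible transmissions. The following is a minimal sketch, not a clinical tool: it assumes each variant is heterozygous in the parent who carries it, is passed on with probability 1/2, and segregates independently of the other (reasonable here only because the two variants lie in different genes). The function name `prob_at_least_one` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_at_least_one(n_variants: int) -> Fraction:
    """Chance that a relative inherits at least one of n variants.

    Assumption (not from the source): every variant is transmitted
    independently with probability 1/2.
    """
    # Each outcome is a tuple of booleans: inherited / not inherited.
    outcomes = list(product([False, True], repeat=n_variants))
    carriers = [o for o in outcomes if any(o)]
    return Fraction(len(carriers), len(outcomes))


print(prob_at_least_one(2))  # 3/4
```

Of the four equally likely outcomes for two variants, only one (neither inherited) leaves the relative without a variant, which gives 1 − (1/2)² = 3/4.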

With new knowledge arises change in our understanding of disease causation and pathogenesis. Although variant interpretation and classification systems have been developed and refined, they need to take into account rapidly evolving human genetic databases, such as the ExAC database, 1000 Genomes Project, Exome Variant Server, and most recently, the Genomics England 100 000 Genomes Project. All of these databases have arisen because of the dramatic and rapid advances in genetic technologies. Although once we would be satisfied seeing a potentially pathogenic variant being absent from 200 control alleles, we now often compare to over 50 000 control exomes and genomes. As a result, over time, variant status can change, being downgraded from pathogenic to VUS or benign or, in some instances, upgraded from VUS to pathogenic. This may parallel the notion of a false positive, not in the sense of a technical sequencing false positive, but a variant interpretation which initially was conveyed to the patient as a positive result but was subsequently found to be false. In HCM, this reclassification has been estimated to occur in at least 10% of families, 9 though with more stringent variant classification criteria in use, this might be expected to be much less frequent in future. Nevertheless, it highlights the importance of periodic re-evaluation (every 1–2 years) of all variants in the setting of rapidly emerging general population data.
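To see why the size of the control set matters so much, consider how common a variant could still be in the population while going unseen in a control sample. The sketch below is illustrative only and is not a method described in the source: it assumes alleles are sampled independently, so a variant with true frequency q is missed in n alleles with probability (1 − q)^n, and it treats 50 000 individuals as 100 000 autosomal alleles. The function name and the 95% confidence level are hypothetical choices.

```python
def max_freq_if_unseen(n_alleles: int, confidence: float = 0.95) -> float:
    """Largest allele frequency still compatible, at the given confidence,
    with observing zero copies in n_alleles independently sampled alleles.

    Solves (1 - q) ** n_alleles = 1 - confidence for q.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_alleles)


# 200 control alleles versus 50 000 individuals (assumed 2 alleles each).
for n in (200, 2 * 50_000):
    print(f"{n:>7} alleles: frequency could be up to {max_freq_if_unseen(n):.4%}")
```

Under these assumptions, absence from 200 alleles only bounds the frequency at roughly 1.5%, whereas absence from 100 000 alleles bounds it at roughly 0.003%, which is why variants once thought to be rare can later turn out to be too common to cause disease.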

Clinical Implications of Incidental Gene Variant Findings

Case 4: A family presents to a specialized cardiac genetic clinic after the identification of an incidental genetic finding in AKAP9 NM_005751.4:c.4342A>G, p.Ile1448Val during whole exome sequencing for investigation of benign hereditary chorea. AKAP9 is the gene responsible for LQT11, 47 and the finding of a VUS in a potentially lethal cardiac arrhythmia syndrome precipitates an array of cardiac investigations and segregation studies among the family members. No family members are found to have any clinical evidence of long QT syndrome and were reassured that the AKAP9 variant is unlikely to be of clinical importance. No further investigation was recommended. Three years on, this variant is known to be present in the general population at a frequency of 0.05% in the ExAC database, and the variant has been downgraded to a likely benign status.

Discovery of a clinically actionable finding during investigation for an unrelated problem is not a new concept in medicine. The example of a lung lesion identified during routine imaging of the cardiac structures is well described. An ethical dilemma ensues, where issues may arise regarding the patient’s preferences to receiving the unexpected information, the uncertainty and fear generated, and the medical professional’s flow-on decisions that likely lead the patient down a pathway of additional testing. Our reflex response to such findings is often overwhelming gratitude that disease was detected early and an eagerness to do everything possible to avoid unfavourable outcomes, both by the patient and the clinician. This existential fear of death drives much of the response to an uncertain diagnosis, 48 and for clinicians, there may be additional fear of litigation if they fail to do everything possible. The response to an incidental finding is, therefore, just as much a part of the problem. In our example of an incidental lung lesion identified during routine imaging, although the pathway of investigation that follows is contentious, it is a well-trodden track. Incidental findings in genetic medicine pose much greater uncertainty, with little knowledge about whether the variant is truly capable of causing disease. 49

Incidental genetic findings are an increasing reality in cardiac genetics and present an ethical challenge. 50,51 Although a clinical incidental finding may be resolved quickly (eg, a liver lesion detected during a lung computed tomographic scan is quickly resolved as a benign cyst on liver ultrasound imaging), incidental genetic findings may potentially remain unresolved for several years because the gene or the specific mutation is poorly understood. In the most extreme circumstances, incidental genetic findings will include reporting of important variants in noncardiac genes (eg, familial cancer genes) and, therefore, result in the patient being given information completely unrelated to the purpose of the genetic test. It is difficult to argue that these are in fact unexpected or incidental results when a whole exome sequencing or whole genome sequencing approach of screening all 22 000 genes is used. For this reason, there is a move to recognize these as secondary genetic results, 49 which highlights the role of pretest genetic counseling 10,52 and informed consent. 53

The American College of Medical Genetics and Genomics’s recommendations for the reporting of incidental findings in clinical exome and genome sequencing controversially provide a list of 56 genes in which actionable variants should be reported back to the patient, regardless of the purpose of the test and the patient’s preferences for knowing this information. 51 These reportable variants pose a major clinical challenge, with between 12 and 20 clinically reportable variants identified per healthy individual, 15 and in 3% to 5% of patients referred for investigation of a disease phenotype. 54

The issue of incidental genetic findings is not just limited to whole exome sequencing/whole genome sequencing approaches. Cardiac gene panel testing can also present the challenges of unexpected findings. More often, probands who have a genetic test for HCM will have variants in other cardiac genes reported, clouding the value of the result. This phenomenon results in confusion for the patient, but also adds complexity to the response of the clinician in investigating these results. For example, an HCM patient who returns a VUS result in the DSP gene may pose a clinical dilemma. DSP is a desmosome gene associated with arrhythmogenic right ventricular cardiomyopathy, a disease clinically distinct from HCM. Thus, a novel, rare missense variant in DSP is unlikely to be the cause of disease in a person with HCM, and indeed, this gene is known to be somewhat tolerant to variation. 55 Despite this, there is a trend to pursue additional cardiac investigations in these families, in search of a phenotype that might explain the uncertain genetic result. Strategies to minimize incidental and uncertain findings (ie, more targeted gene panels) and family management in high-volume specialized clinics with expertise in HCM genetics will certainly minimize downstream delivery of low-value and costly health care.

A Balanced Approach to HCM Genetic Testing: 6 Key Points

Although there may be a lack of evidence to suggest there is overall benefit for early disease detection in HCM, the purpose of this review is not to dishearten clinicians from the true value of genetic testing in HCM and other inherited cardiac diseases. We hope to have shone a spotlight on the potential limitations, so that this technology can be used in the most effective and appropriate way, with the overall goal to effect some improvement in clinical care. The following 6 points outline the key practical clinical implications that serve as a guide to avoiding some pitfalls of genetic testing and minimizing harms to the patient (Figure 2).

Figure 2. Six points for minimizing potential harms of hypertrophic cardiomyopathy genetic testing. VUS indicates variant of uncertain significance.

Choose the Appropriate Genetic Test

With the multitude of genetic tests commercially available, there is great importance in choosing the most appropriate test. Confining your analysis to a smaller number of genes (ie, targeted panels over more broad approaches such as clinical exome/genome sequencing) will reduce the number of uncertain and incidental variants. In the setting of a proband with a confirmed clinical diagnosis of HCM, screening of 10 to 15 genes is entirely adequate. Performing genetic tests that encompass more genes in this instance needs to be questioned, though this is becoming increasingly difficult as laboratories continue to expand the available gene panels. An important exception, however, is families with multiple affected individuals, in which segregation of uncertain variants can clarify causation, though this approach may be better suited to the research setting.

Be Confident That the Variant Identified in the Proband Is Actually Disease-Causing

If there is any doubt regarding the pathogenicity of the variant in the proband, no cascade genetic testing should be undertaken in the family. Such VUS findings require further assessment and re-evaluation and often form the basis of ongoing research studies. For the clinician without expertise in genetics, there are some general aspects of a variant that might act as a red flag for questioning a result. First, the variant being reported should be in a gene previously implicated in HCM. Second, the variant in many cases should have been reported previously in unrelated probands with confirmed HCM, though exceptions to this might occur with radical mutations, leading to loss of function in a gene known to result in disease, for example, MYBPC3. Third, the variant should be exceedingly rare in general population databases, and this can be re-evaluated as new larger data sets are published online. These are not fail-safe criteria, but should serve as a guide to questioning a laboratory’s clinical reporting of variants. If we are to realize any of the benefits of HCM genetic testing, basing the cascade genetic testing on incorrect gene variant results must be avoided.

Not Every Family Member Needs Cascade Genetic Testing

Cascade genetic testing of at-risk family members needs to be considered on a case-by-case basis. Weighing up the potential benefits with the possibility of causing harm is essential. Key to this decision are the preferences of the family member. How a lay person can understand and make decisions regarding complex genetic information is not well studied, but must become a priority if we hope to facilitate informed choice. Many may be content to attend for clinical investigation periodically, knowing that no treatments would be initiated unless there was evidence of clinical disease. Potential for insurance or employment discrimination may also be considered, though this will differ between countries; legislation such as the Genetic Information Non-Discrimination Act (GINA) 56 highlights how the United States of America has worked to minimize some of these potential harms. Detailed pretest genetic counseling is essential in facilitating this decision; however, future research to elucidate effective methods to promote informed choice will be needed.

An area where informed choice is problematic is that of testing children. Genetic societies worldwide have acknowledged this, and many have developed strict guidelines on processes around testing children, with the key premise being that there should be some imminent health benefit. 57 In HCM, it can be argued whether a positive genetic result of an asymptomatic child will hold any immediate medical benefit, and for this reason, comprehensive genetic counseling with involvement from a child psychologist if practical is advocated. 7

Silent Gene Carriers Are Not Patients

Silent gene carriers have arisen as a direct result of more widespread genetic testing. As such, in the absence of a clinical phenotype, they should not be seen as patients but rather those under surveillance. Current American Heart Association guidelines suggest clinical follow-up of silent gene carriers every 3 to 5 years after the age of 18 to 21 years and, importantly, do not suggest restriction from competitive sports. 3 How we speak to the asymptomatic family members about the outcomes of cascade genetic testing, therefore, needs to reflect this. This also highlights the need for evidence-based guidelines around management of silent gene carriers, to elicit greater confidence from the clinicians managing the families, and to avoid overzealous and unnecessary clinical investigations.

Reasonable Response and Investigation of Uncertain Variants, Including Incidental Findings

If variants in genes not related to a phenotype of HCM are identified, then by definition they are uncertain. The interpretation of a genetic result in a clinical setting must be limited only to those genes with clear established association with the phenotype in question. Investigation of new genes must be confined to research studies and centers with significant cardiac genetic expertise. The identification of uncertain and incidental variants can be problematic, but just as important are the clinician’s flow-on decisions. Care should be taken to avoid a reflexive reaction of comprehensive clinical investigations and segregation studies in the family members where there is no clear benefit.

Pretest Genetic Counseling and Disease Expertise Make for Informed Decision-Making

Comprehensive pretest genetic counseling is an essential and necessary step before genetic testing. 4,20 This means ensuring the participant is informed and aware of the possible outcomes, discussing psychosocial considerations, and supporting an autonomous decision that is based on individual preferences and values. Another aspect of informed choice when considering cascade genetic testing is accurate and appropriate risk information, taking into account varying levels of health literacy across populations. Further understanding of the natural history and clinical outcomes for silent gene carriers will provide a more sound evidence base on which family members can make a truly informed decision about whether to pursue genetic testing. The experience and expertise of high-volume specialized multidisciplinary HCM centers are increasingly valuable in this setting. 32

Conclusions

Implementation of new technologies for early diagnosis, improving outcomes, and survival is commonplace in medicine. Foremost, the emergence of new genetic sequencing technologies has resulted in an exciting new phase in the field of cardiac genetic testing. In HCM, the potential for both benefits and harms is one that must be considered. Given the many considerations and potential outcomes of testing asymptomatic at-risk individuals, a balanced approach needs to be undertaken. This should consider the clinical situation, the patient needs, and the available genetic testing options, with the overall goal to contribute some improvement in clinical care.

Acknowledgments

Dr Ingles is the recipient of a National Health and Medical Research Council (NHMRC) and National Heart Foundation of Australia Early Career Fellowship (No 1036756). Dr Semsarian is the recipient of a NHMRC Practitioner Fellowship (No 571084).


Materials and Methods

Structural Phylogenomic Analysis

In this study we mapped the evolution of aaRS domains in a published evolutionary timeline of domain appearance at fold family (FF) level of structural abstraction [6], [7]. This timeline was selected for a number of reasons: FFs generally provide structures with unambiguous assignments of molecular functions, the timeline is well annotated, and results can be benchmarked to a description of the rise of early structures and functions [7]. The timeline was derived from a phylogenomic tree of 2,397 FF structures (out of 3,464 defined by the structural classification of proteins (scop) 1.73 [8]) reconstructed from a structural census in the genomes of 420 free-living organisms from all three cellular superkingdoms (FL420). The timeline was for all purposes congruent to a timeline derived from a phylogenomic tree of 3,513 FFs (out of 3,902 defined by scop 1.75) reconstructed from a census of 989 genomes (A989) [14]. In these structural genomic censuses, hmmscan of the profile HMMER3 package scans genomic sequences (with probability cutoffs E of 10^−4) against a library of advanced linear HMMs of structural recognition in superfamily [73]. We note that the phylogenomic approach based on structure (summarized in the flow diagram of Figure 1A) is impervious to a number of limitations that plague sequence analysis, such as problems of alignment, character independence, inapplicable characters, saturation and taxon sampling [74], and is even robust against uneven sampling of genomes across the three superkingdoms [75].

Despite robust evolutionary trends across phylogenies [5], the exact order of closely positioned FFs can be debatable in phylogenetic reconstructions of trees with thousands of leaves. For this reason, we sub-selected domains that were part of aaRSs and generated rooted trees describing the evolution of only the FFs associated with these enzymes (Figure S1, panel A). Tree reconstructions were carried out using maximum parsimony as optimality criterion and a combined parsimony ratchet as previously described [6], [14]. The trees were rooted by the Lundberg method, which does not impose a requirement of outgroup taxa. Phylogenetic reliability was evaluated by the nonparametric bootstrap method with 1,000 replicates, with resampling size being the same as the number of the genomes sampled, TBR, and maxtrees unrestricted. The structure of phylogenetic signal in the data was tested by the skewness (g1) of the length distribution of 1,000 random trees [76]. Tree distribution profiles and metrics of skewness indicated strong cladistic structure (p<0.01). Recovered trees were well resolved and had basal topologies that matched those of homologous subtrees in the published trees of FFs. Bootstrap support (BS) values for basal branches were 100 and ranged 56–83 in more derived branches; the very derived regions were variable within the 9 most parsimonious trees that were retained. This indicates that the topologies provide strong support to phylogenetic statements, with support increasing towards the base of the tree. In a recent study, we also reconstructed trees of aaRS domains [77]. These trees were derived from a census of protein structure in 1,037 genomes that included organisms in the three superkingdoms and viruses. Again, topologies were remarkably consistent. Given these results, we used domain ages obtained from the global published phylogenies to place aaRSs in the timeline along with other domains linked to the ribosome and non-ribosomal protein synthetases (NRPS) that we used as reference (Figure 2). For simplicity, domains are here identified with concise classification strings (ccs). For example, the catalytic domain of tyrosyl-tRNA synthetase [EC 6.1.1.1] corresponds to the c.26.1.1 FF, in which c represents the protein class (α/β proteins), 26 the F (adenine nucleotide alpha hydrolase-like fold), 1 the FSF (nucleotidylyl transferase superfamily), and 1 the FF (Class I aaRSs, catalytic domain).

The relative age of protein structures (nd) was calculated directly from the rooted trees using a script that counts the number of nodes from the root (base) of the tree to each leaf and expresses it on a relative zero-to-one scale. These nd values take advantage of the highly imbalanced nature of the trees of domain structures, as recently discussed [78]. Tree imbalance in these trees is a natural consequence of a heritable trait [79], the accumulation of domain structures in proteins and proteomes [13], which naturally poises speciation [80]. In fact, the δ-test [81] confirms that imbalance in trees of domains was not the result of the node density artifact and represents a true evolutionary process. Moreover, we find that trees do not follow random or Yule models of speciation, which can be considered to drive the evolution of species [13]. The nd values are also a good proxy for geological time. The molecular clocks for protein domains at the fold (F) level (t = –3.802 ndF + 3.814) and the fold-superfamily (FSF) level (t = –3.831 ndFSF + 3.628) [11] were used to calculate the geological ages of selected families (in billions of years ago, Gya), provided that FFs were the most ancient in each group. We note that extending the clock to FFs showed that domain age continued to be proportional to time but with larger dispersion at high ndFF values. Clocks were calibrated with geological ages derived from the study of fossils and geochemical, biochemical, and biomarker data, which are affected by the validity of the assumptions used in each of the supporting studies [13]. We also note that the molecular clock derived from trees of Fs and FSFs necessarily depends on the rates of domain discovery and accumulation, which could deviate for some domain structures. These factors could cause departures from a clock, with overdispersion sometimes resulting from changes in foldability and structural stability of domains [63].
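The node-counting step and the two clock equations can be illustrated with a minimal Python sketch. It is not the authors' script: the nested-tuple tree encoding, the function names, and the exact counting convention (internal nodes passed between the root and a leaf, divided by the largest such count) are assumptions made for illustration. Only the slope and intercept values come from the text.

```python
def node_distances(tree):
    """Return relative ages (nd, 0..1) for the leaves of a rooted tree.

    `tree` is a nested tuple whose leaves are strings (hypothetical encoding).
    Assumed convention: nd is the number of internal nodes passed between the
    root and a leaf, divided by the largest such count in the tree.
    """
    counts = {}

    def walk(node, passed):
        if isinstance(node, str):       # reached a leaf
            counts[node] = passed
            return
        for child in node:              # internal node: one more node passed
            walk(child, passed + 1)

    for child in tree:                  # children of the root start at zero
        walk(child, 0)
    deepest = max(counts.values())
    return {leaf: (c / deepest if deepest else 0.0) for leaf, c in counts.items()}


def age_gya(nd, level="F"):
    """Convert nd to a geological age (Gya) using the clocks quoted in the text."""
    slope, intercept = {"F": (-3.802, 3.814), "FSF": (-3.831, 3.628)}[level]
    return slope * nd + intercept


# Toy, highly imbalanced tree: A branches off first, C and D last.
toy_tree = ("A", ("B", ("C", "D")))
nd = node_distances(toy_tree)
ages = {leaf: age_gya(value, level="F") for leaf, value in nd.items()}
```

In this toy tree the most basal leaf receives nd = 0 and therefore the oldest age under either clock, which mirrors how imbalance in the tree translates into a timeline.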

Phylogenetic Analysis of tRNA Structure

The ages of tRNA molecules here used were derived from published timelines of amino acid charging and encoding generated from trees of tRNA structure [28]. The method extracts phylogenetic signatures from structural topology in RNA [22], [23], [28], [29], [82]–[87]. These signatures are drawn from links between secondary structure and conformation, dynamics and adaptation [88]. Geometrical and statistical features of RNA substructures are scored in thousands of molecules and this information is analyzed with modern phylogenetic methods to produce trees of molecules and trees of substructures that portray the history of the system (molecules) or its component parts (substructures), respectively (Figure 1C). The phylogenetic model automatically roots the trees by assuming conformational stability increases in evolution as structures become canalized. The validity of polarization and rooting depends on the axiomatic component of character transformation, which is supported by considerable evidence and is also falsifiable [87].

Phylogenetic constraint analysis restricts the search for optimal trees of tRNAs to pre-specified topologies [29], [85] and can provide important insights from trees of molecules. Here, the minimum number of additional steps (S) required to force groups of tRNA taxa into trees (non-mutually exclusive hypotheses) defined measures of ancestrality of individual groups and was used to build evolutionary timelines of amino acid charging and amino acid encoding. Hypotheses with smaller S were considered less affected by recruitment and represented processes that were more ancient. Using this approach, chronologies of amino acid charging and codon discovery were directly derived for isoacceptor (Saac) and anticodon-specific (Scod) tRNAs, respectively. The validity of character argumentation and the assumption that groups requiring a lower number of steps are more ancient were derived from the rooted trees and the model of character polarization [87].

Finding Coevolutionary Patterns

The ages of protein domain structures at FF level (ndFF) were plotted against the ages of isoacceptor tRNAs (Saac) and anticodon-specific tRNAs (Scod). We found that correlations were significant in all instances (P<0.0067). Regression lines unfolded evolutionary timelines of archaic editing functions and anticodon-binding specificities, once editing and other accessory domains were identified. Specifically, catalytic and editing domains appearing before anticodon-binding domains (ndFF <0.196) and editing domains for AlaRS, ThrRS and PheRS appearing after that age (during a period in which both the operational and standard genetic code were unfolding) were included in correlation studies. Regression timelines were also compared to a conservative idealized timeline that spans domain age and underweights tRNA evolution (relative rates of structural change in protein domains can be 4.3 times higher than in tRNA). Two groups of aaRSs appearing close together in the idealized timeline are part of well-established superclusters defined by sequence and structural analysis [89]: the superclusters of Class I LeuRS, MetRS, IleRS and ValRS and of Class II SerRS and ProRS (Figure 2B). In this regard, the higher Saac values of tRNAs interacting with the structurally related ProRS, LysRS and MetRS make them evolutionarily derived when compared to SerRS, a fact that is made explicit by the coevolutionary patterns. A recent sequence-based analysis of SerRS, ProRS and ThrRS indeed shows the late appearance of ThrRS [90]. This observation supports the timeline’s validity and the reuse of domain structures in evolution to unfold different specificities.
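As an illustration of the correlation and regression step, the following Python sketch computes a Pearson correlation coefficient and a least-squares regression line for paired ages. The text does not state which software was used, and the input values below are placeholders rather than data from the study, so this should be read only as one plain way of carrying out such a calculation.

```python
from math import sqrt


def pearson_and_fit(x, y):
    """Return (r, slope, intercept) for paired observations x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
    slope = sxy / sxx                    # least-squares slope of y on x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Placeholder values standing in for domain ages (ndFF) and tRNA ages (Saac).
nd_ff_example = [0.0, 1.0, 2.0, 3.0]
s_aac_example = [1.0, 3.0, 5.0, 7.0]
r, slope, intercept = pearson_and_fit(nd_ff_example, s_aac_example)
```

A significance test for r (the P values reported in the text) would require a t or permutation test on top of this, which is left out of the sketch.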

Genetic Code Complementarity and Identity Elements

Putative imprints of the primordial complementarity proposed to exist in the genetic code were borrowed directly from Rodin and Rodin [37]. Nucleotides were defined according to the IUPAC-IUB Commission on Biochemical Nomenclature. Identity elements in tRNA were identified by in vitro and in vivo approaches [25]–[27]. Loss of aminoacylation efficiency (L) for identity elements in tRNA was given as L = (k/Km)wt/(k/Km)mutant, with rate constants being either kcat or Vmax [25]. Patterns of conservation in tRNA were derived from tRNAdb and identified in tRNAs using established nucleotide numbers [91].
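The ratio L = (k/Km)wt/(k/Km)mutant is simple enough to write out directly. The short Python sketch below does so with a hypothetical function name and made-up kinetic values; it only restates the formula given in the text.

```python
def loss_of_efficiency(k_wt, km_wt, k_mut, km_mut):
    """L = (k/Km) of wild type divided by (k/Km) of the mutant.

    k may be kcat or Vmax, as long as the same constant is used for both
    the wild-type and the mutant tRNA.
    """
    return (k_wt / km_wt) / (k_mut / km_mut)


# Made-up values: the mutant has a 4-fold lower k and a 5-fold higher Km.
example_loss = loss_of_efficiency(k_wt=2.0, km_wt=1.0, k_mut=0.5, km_mut=5.0)
```

With these illustrative numbers the mutation reduces aminoacylation efficiency about 20-fold; larger L values point to nucleotides that contribute more strongly to identity.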

Analysis of Protein Sequences and Structures

We analyzed amino acid frequencies in secondary structures for a non-culled and a culled set of protein entries from the Protein Data Bank (PDB) (Figure S1). Secondary structures were assigned using the DSSP program [92]. The non-culled set included 204,531 domain sequences (51,392,487 amino acids) downloaded from the PDB (June 20, 2012). The culled set of high-quality PDB entries was selected using the protein sequence-culling server PISCES [93] with the following thresholds: sequence percentage identity, ≤25%; resolution, 0.0–2.5 Å; R-factor, 0.25; sequence length, 40–10,000; exclusion of non-X-ray and Cα-only entries; culling by chain. The culled set included 6,828 sequences (1,654,074 amino acids). Since data set sizes are large (n >100), it is appropriate to use a parametric method (ANOVA) to test whether there is a significant difference among the frequencies of the three amino acid groups (helix, turn and strand) in every secondary structure.
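To make the ANOVA step concrete, here is a minimal Python sketch of the one-way ANOVA F statistic for several groups of frequencies. It is a textbook calculation, not the authors' code, and the three groups below are placeholder numbers. Converting F to a P value requires the F distribution, which is omitted here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, means))
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder frequencies for three groups of amino acids.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```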

To calculate amino acid frequencies in different FFs, we assigned structures to PDB sequences with the HMMs from SUPERFAMILY [73]. FF assignments were defined according to the SCOP database [8]. The identity of PDB sequences was set to ≤25% using PISCES. All PDB sequences with two or more FFs or with unassigned ranges longer than 30 amino acids were eliminated from the study. The final PDB sequence set contained 2,384 sequences covered by 1,475 FFs.

We also examined whether each of the 20 amino acids or the 408 possible dipeptides that included 2 ambiguous amino acids, Z and X, was enriched in FFs that were more ancient than the oldest anticodon-binding domain of the timeline (305 sequences appearing earlier than c.51.1.1, ndFF <0.2) when compared to the set of 2,384 sequences and 1,475 FFs appearing throughout the timeline (nd = 0–1). For each of the two sequence sets, we counted the numbers of multiple occurrences of amino acids or dipeptides and then calculated the probability of enrichment of every amino acid or dipeptide that was present in the ancient 305-sequence set using the hypergeometric distribution and the following equation [75] (the equation itself did not survive extraction and is given here in the standard hypergeometric form implied by the symbol definitions that follow):

$$P(X = k) = \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}}$$

Observed values M and k indicate the numbers of multiple occurrences of the examined amino acid or dipeptide in the 2,384-sequence and the 305-sequence set, respectively. The values N and n are the numbers of multiple occurrences of all amino acids or dipeptides in the two sequence sets, respectively. The probability P(X = k) is the chance that a random variable X takes the value k, that is, that a given amino acid or dipeptide occurs k times in the ancient set. Referring to the equation and previous literature [94], we calculated P values for every amino acid or dipeptide with k/n larger than M/N, and evaluated the statistical significance of enrichment of each amino acid or dipeptide in the ancient FFs at the 99% confidence level (P<0.01).
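A small Python sketch of this enrichment test follows, using the standard hypergeometric probability mass function with the symbols M, k, N and n as defined above. Whether the reported P values are point probabilities or upper-tail sums is not stated in the text; the upper-tail version below is an assumption, and the function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, M, n, N):
    """P(X = k): k occurrences in a sample of size n, drawn from a
    population of size N that contains M occurrences in total."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


def enrichment_p(k, M, n, N):
    """Upper-tail probability P(X >= k) (assumed definition of the P value)."""
    return sum(hypergeom_pmf(i, M, n, N) for i in range(k, min(M, n) + 1))


def is_enriched(k, M, n, N, alpha=0.01):
    """Test only items whose sample fraction k/n exceeds the background M/N."""
    return k / n > M / N and enrichment_p(k, M, n, N) < alpha


# Tiny illustrative numbers, far smaller than the counts in the study.
p_point = hypergeom_pmf(2, 4, 3, 10)
p_tail = enrichment_p(2, 4, 3, 10)
```

Exact integer binomials keep the sketch transparent, but for counts in the millions a log-space or library implementation would be much faster.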

Annotations of Molecular Functions

Molecular functions linked to FFs were annotated using the hierarchical classification of SUPERFAMILY [73] that assigns seven general functional categories and 50 subcategories to SCOP IDs [8] based on information in SCOP, InterPro, Pfam, SwissProt, and literature sources (http://supfam.cs.bris.ac.uk/SUPERFAMILY/function.html). Domain architectures were queried in the PDB database [95] (http://www.rcsb.org/pdb/home/) and annotated using Gene Ontology (GO) [96] (http://www.geneontology.org). GO terms define a vocabulary of molecular functions, biological processes, and cellular components. We examined GO terms linked to molecular functions of translation. Manual annotations also involved queries in the UniProtKB (protein knowledgebase) database (http://www.uniprot.org/) and HMM-based structure assignments. Annotations were mapped onto the domain timeline.

Structural Alignments

Structural alignments of individual protein structural entries against structural sets were performed using DALI conservation mapping [97] or GANGSTA+ [98]. The DALI server performs pairwise comparisons to PDB90 based on a systematic branch-and-bound search that returns non-overlapping solutions in decreasing order of alignment Z-scores. Selected subsets of structural neighbors were displayed as multiple 3D superpositions to visualize structural and sequence conservation. Alignments with GANGSTA+, an implementation that uses an advanced non-sequential alignment method with proper assignment of helices and strands in the structure, were run against version 1.75 of the ASTRAL40 compendium.


Statistical Analysis

Population-level risks were estimated for groups of pathogenic alleles, genes, or loci with the use of odds ratios with significance thresholds (corrected for multiple testing) and 95% confidence intervals. 13,25 The odds ratios were converted to estimated population penetrance (equivalent to the population incidence or risk) with Bayes’ theorem, under the assumption of an incidence of 15 cases per 100,000 European-ancestry live births. 11 Allele frequencies among controls were obtained from a variety of public resources 17,21,26 to estimate the population attributable risk. Additional details are provided in the Methods section in the Supplementary Appendix.
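The two calculations named here, an odds ratio with a 95% confidence interval and its conversion to penetrance through Bayes' theorem, can be sketched in Python as below. The exact procedure is described in the Supplementary Appendix, which is not reproduced here, so the Woolf (log-scale) interval, the function names and the example counts are assumptions. Only the incidence of 15 per 100,000 comes from the text, and correction for multiple testing is not shown.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf 95% CI from a 2x2 table.

    a: carriers among cases      b: non-carriers among cases
    c: carriers among controls   d: non-carriers among controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def penetrance(carrier_freq_cases, carrier_freq_controls, incidence=15 / 100_000):
    """P(disease | carrier) from Bayes' theorem.

    P(D|C) = P(C|D) P(D) / [P(C|D) P(D) + P(C|not D) (1 - P(D))]
    """
    numerator = carrier_freq_cases * incidence
    return numerator / (numerator + carrier_freq_controls * (1 - incidence))


# Made-up counts: 20 of 100 cases and 10 of 100 controls carry an allele.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
risk = penetrance(20 / 100, 10 / 100)
```

Because the assumed incidence is very low, even a carrier frequency twice as high in cases as in controls yields a penetrance of only about 3 in 10,000 in this example.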


2 MATERIALS AND METHODS

2.1 Patients

The index patient (Figure 1, individual I:2) was selected from the individuals followed at the University Medical Centre Ljubljana (UMC) over an 8-year period according to the diagnostic algorithm for erythrocytosis. 17 The inclusion criteria were as follows: (a) haemoglobin and/or haematocrit above reference values at least twice over 2 months; (b) absence of variants JAK2 p.Val617Phe and JAK2 exon 12; (c) absence of any defined cause of secondary acquired erythrocytosis; and (d) absence of variants in genes for the thrombopoietin receptor (MPL), calreticulin (CALR) and receptor tyrosine kinase (KIT). 17 The index patient was a 68-year-old male who was referred to the Clinical Department of Haematology at UMC Ljubljana in 2013 because of a high RBC count of 6.36 × 10^12 cells/L, high haemoglobin of 221 g/L and high haematocrit of 0.650. His EPO level was in the normal range (11.4 IU/L). The patient also had elevated blood pressure and heart rate. He was prescribed therapeutic phlebotomy, and at the last follow-up in 2019 he had a slightly increased haemoglobin of 171 g/L. On review of his medical history, he reported that his son also had clinical signs of erythrocytosis.

The son (Figure 1, individual II:1) of the index patient was 41 years old, and at his first appointment in 2015, he presented with an RBC count of 6.29 × 10^12 cells/L, high haemoglobin of 192 g/L and high haematocrit of 0.552, while his EPO level was normal (8.3 IU/L). The son was a regular blood donor, and so he was not referred for phlebotomy. These two patients were tested for polycythaemia vera and hereditary haemochromatosis and were negative for variants p.Val617Phe and exon 12 in the JAK2 gene, and also for variants p.Cys282Tyr, p.His63Asp and p.Ser65Asp in the gene for homeostatic iron regulator (HFE). Neither of these two patients showed any pulmonary, cardiac or renal abnormalities.

The wife and mother of these elder and younger patients, respectively (Figure 1, individual I:1), is deceased, but had had normal RBC counts, haemoglobin levels and haematocrit, and no signs of erythrocytosis.

All three affected and healthy family members were, together with a reference DNA control NA12878, included in the NGS genetic testing for congenital erythrocytosis. The study was approved by the Slovenian Ethical Committee, No. KME 115/07/15.

2.2 Genetic analysis

Peripheral blood was collected from the patients for genetic analysis, and written informed consent was obtained from all patients. Granulocytes were isolated from the collected peripheral blood, and genomic DNA was extracted from 1 to 2 × 10^7 cells using QIAamp DNA mini kits (Qiagen). The patients underwent targeted NGS, with a custom gene panel that covered the target regions of 39 selected genes: 21 genes previously associated with congenital erythrocytosis and polycythaemia vera, 12 three additional erythrocytosis-associated genes (PKLR, TET2 and GATA1) and 15 genes previously associated with hereditary haemochromatosis. 18 Exon regions were targeted for all of the selected genes, while for EPO, VHL and some of the other genes the first intron, promoter and enhancer regions were also included 19, 20 (Table 2).

Association | Exon | Intron 1 | Promoter and enhancer
Erythrocytosis-associated genes (a) | BHLHE41, BPGM, EGLN1, EGLN2, EGLN3, EPAS1, EPO, EPOR, GFI1B, HBA1, HBA2, HBB, HIF1A, HIF1AN, HIF3A, JAK2, KDM6A, OS9, PKLR, SH2B3, VHL, ZNF197, GATA1 and TET2 | VHL, EPO, EPOR, HBB, HBA1, HBA2 | EPO
Hereditary haemochromatosis-associated genes (b) | HFE, HJV, HAMP, TFR2, SLC40A1, FTH1, TF, B2M, CP, FTL, CDAN1, SEC23B, SLC25A38, STEAP3 and ALAS2 | - | -
  • BHLHE41 gene for basic helix-loop-helix e41, BPGM gene for bisphosphoglycerate mutase, EGLN1 gene for egl-9 family hypoxia-inducible factor 1, EGLN2 gene for egl-9 family hypoxia-inducible factor 2, EGLN3 gene for egl-9 family hypoxia-inducible factor 3, EPAS1 gene for endothelial PAS domain protein 1, EPO gene for erythropoietin, EPOR gene for erythropoietin receptor, GFI1B gene for growth factor-independent 1B transcriptional repressor, HBA1 gene for haemoglobin subunit alpha 1, HBA2 gene for haemoglobin subunit alpha 2, HBB gene for haemoglobin subunit beta, HIF1A gene for hypoxia-inducible factor 1 subunit alpha, HIF1AN gene for hypoxia-inducible factor 1 subunit alpha inhibitor, HIF3A gene for hypoxia-inducible factor 3 subunit alpha, JAK2 gene for Janus kinase 2, KDM6A gene for lysine demethylase 6A, OS9 gene for OS9, PKLR gene for pyruvate kinase L/R, SH2B3 gene for SH2B adaptor protein 3, VHL gene for von Hippel-Lindau tumor suppressor, ZNF197 gene for zinc finger protein 197, GATA1 gene for GATA binding protein 1, TET2 gene for tet methylcytosine dioxygenase 2, HFE gene for homeostatic iron regulator, HJV gene for hemojuvelin BMP co-receptor, HAMP gene for hepcidin antimicrobial peptide, TFR2 gene for transferrin receptor 2, SLC40A1 gene for solute carrier family 40 member 1, FTH1 gene for ferritin heavy chain 1, TF gene for transferrin, B2M gene for beta-2-microglobulin, CP gene for ceruloplasmin, FTL gene for ferritin light chain, CDAN1 gene for codanin 1, SEC23B gene for SEC23 homolog B, SLC25A38 gene for solute carrier family 25 member 38, STEAP3 gene for STEAP3 metalloreductase, ALAS2 gene for 5’-aminolevulinate synthase 2.
  • a Adopted from Camps et al. (2016). 12
  • b Adopted from Lanktree et al. (2017). 18

Libraries were prepared using Nextera DNA library preparation kits (Illumina), with enrichment performed by a probe hybridization approach using a custom gene panel (Integrated DNA Technologies), followed by sequencing (MiniSeq, Illumina). The disease-risk variants identified were validated by Sanger sequencing (GATC Biotech). Sanger sequencing and prior PCR amplification were performed with custom-designed primers (Integrated DNA Technologies; available upon request).

2.3 Bioinformatics analysis

The sequencing analysis was performed with built-in bioinformatics tools (Illumina) and variant annotation with an online tool (Variant Interpreter, Illumina). The sequences were aligned to reference genome hg19 (GRCh37). To remove variants with low sequencing quality, the following filters were used within the variant caller: genotype quality (i.e. GQX value) <30 or not present; quality by depth <2; root mean square mapping quality <20; strand bias > −10; and read depth <1. The variants were first selected based on the relationship between genotypes and phenotypes; this selection was for variants identified as heterozygous in the affected family members and not in the healthy family members, or identified as homozygous in the affected family members and as heterozygous in the healthy family members. For the final selection, variants with minor allele frequencies (MAFs) <0.05 in the European population were retained, as MAF <0.05 is characteristic of low-frequency variants. 21 We selected data from a European Non-Finnish population managed by the GnomAD genome and GnomAD exome databases, and data from a European population managed by the 1000 Genomes database. Before filtering, the presence of high-frequency variants in the JAK2 and HFE genes that cause polycythaemia vera and hereditary haemochromatosis was also assessed. The focus of our analysis was on small variants that involved one or a few nucleotides, such as single nucleotide variants (SNVs) and small insertions/deletions (INDELs).
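The filtering logic in this paragraph (quality filters, segregation with phenotype, then a frequency cut-off) can be sketched in Python as follows. The analysis itself was run inside Illumina tools, so this is only an illustration: the dictionary keys (GQX, QD, MQ, SB, DP, MAF_EUR), the genotype labels and the example variants are all hypothetical, and treating a variant absent from population databases as rare is an assumption.

```python
def passes_quality(variant):
    """Return True if the variant survives the exclusion filters in the text."""
    gqx = variant.get("GQX")
    if gqx is None or gqx < 30:      # genotype quality <30 or not present
        return False
    if variant["QD"] < 2:            # quality by depth
        return False
    if variant["MQ"] < 20:           # root mean square mapping quality
        return False
    if variant["SB"] > -10:          # strand bias
        return False
    if variant["DP"] < 1:            # read depth
        return False
    return True


def segregates(genotypes, affected, healthy):
    """Heterozygous in affected and absent in healthy members, or
    homozygous in affected and heterozygous in healthy members."""
    dominant = (all(genotypes[m] == "het" for m in affected)
                and all(genotypes[m] == "ref" for m in healthy))
    recessive = (all(genotypes[m] == "hom" for m in affected)
                 and all(genotypes[m] == "het" for m in healthy))
    return dominant or recessive


def is_low_frequency(variant, threshold=0.05):
    """MAF below threshold; a missing MAF is treated as rare (assumption)."""
    maf = variant.get("MAF_EUR")
    return maf is None or maf < threshold


def select_variants(variants, affected, healthy):
    return [v for v in variants
            if passes_quality(v)
            and segregates(v["genotypes"], affected, healthy)
            and is_low_frequency(v)]


family = {"I:2": "het", "II:1": "het", "I:1": "ref"}
example_variants = [
    {"id": "var1", "GQX": 45, "QD": 12.0, "MQ": 60, "SB": -50.0, "DP": 80,
     "MAF_EUR": 0.001, "genotypes": family},
    {"id": "var2", "GQX": 12, "QD": 12.0, "MQ": 60, "SB": -50.0, "DP": 80,
     "MAF_EUR": 0.001, "genotypes": family},   # fails genotype quality
    {"id": "var3", "GQX": 45, "QD": 12.0, "MQ": 60, "SB": -50.0, "DP": 80,
     "MAF_EUR": 0.30, "genotypes": family},    # too common
]
selected = select_variants(example_variants, affected=["I:2", "II:1"], healthy=["I:1"])
selected_ids = [v["id"] for v in selected]
```

Keeping the three criteria in separate functions makes it easy to report how many variants each step removes.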

To determine the degree of evolutionary conservation of the variant positions, conservation analysis was performed with the ConSurf server (https://consurf.tau.ac.il/). 22 The ConSurf server predicts the evolutionary conservation of amino acids or nucleotides based on multiple alignment and phylogenetic relations of homologous sequences, which yield position-specific conservation scores. The continuous conservation scores are divided into a scale of nine colour grades for visualization, from the most variable position (grade 1), through intermediately conserved positions (grade 5), to the most conserved position (grade 9). A confidence interval is assigned to each conservation score. If the interval spans four or more colour grades, or if fewer than six homologous sequences are aligned, the conservation score is considered unreliable. 22 The parameters for the conservation analysis of protein sequences were as follows: CSI-BLAST search algorithm; 3 iterations; E-value cut-off 0.0001; protein sequence database UNIREF90; number of analysed homologues 150; minimal identity between homologues 35%; maximal identity between homologues 95%; multiple sequence alignment algorithm ClustalW; and method for calculation of the evolution rate, Bayesian paradigm. For conservation analysis of nucleotide sequences, we manually searched for homologous sequences using the Basic Local Alignment Search Tool (BLAST) program BLASTN (https://blast.ncbi.nlm.nih.gov/Blast.cgi) 23 and performed multiple sequence alignment with ClustalW (https://www.ebi.ac.uk/Tools/msa/clustalo/). 24 For calculation of the evolution rate in the ConSurf server, we selected the Bayesian paradigm.

The pathogenicity of variants in coding regions was assessed using the in silico prediction tools CADD (https://cadd.gs.washington.edu/snv), 25 PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), 26 SIFT (https://sift.bii.a-star.edu.sg/), 27 MutPred2 (http://mutpred.mutdb.org/index.html), 28 SNPs&GO (https://snps.biofold.org/snps-and-go/snps-and-go.html), 29 PANTHER (http://www.pantherdb.org/tools/csnpScoreForm.jsp), 30 PhD-SNP (https://snps.biofold.org/phd-snp/phd-snp.html), 31 PROVEAN (http://provean.jcvi.org/index.php) 32 and Mutation Taster 2 (http://www.mutationtaster.org/), 33 as described by Schiemann and Stowell 2016, 34 Bris et al. 2018 35 and Wang et al. 2020. 36 The pathogenicity of intron variants and their impact on splicing features were analysed using CADD, 25 RegSNP-intron (https://regsnps-intron.ccbb.iupui.edu/), Human Splicing Finder (http://www.umd.be/HSF/) 37 and IntSplice (https://www.med.nagoya-u.ac.jp/neurogenetics/IntSplice/index.html), 38 as reviewed in Ohno et al. 2018 39 and Lin et al. 2019. 40

The CADD (Combined Annotation Dependent Depletion) tool reports the predicted deleteriousness of a variant as a ‘raw’ score and a ‘PHRED-scaled’ score. The PHRED-scaled score expresses the rank of variant pathogenicity: for example, a score of 10 indicates that the variant is predicted to be among the 10% most deleterious substitutions, and a score of 20 among the 1% most deleterious. The suggested threshold is between 10 and 20, and we set the cut-off score at >15. 25 PolyPhen-2 (Polymorphism Phenotyping v. 2.0) predicts the effect of an amino acid change with the classifiers ‘benign’, ‘possibly damaging’ and ‘probably damaging’ and scores from 0.0 to 1.0: variants with values closer to 0.0 are classified as benign, and variants with values closer to 1.0 are more confidently predicted as probably damaging. Variants with a score over 0.50 were predicted to be pathogenic. For the analysis with PolyPhen-2, we present values obtained with the HumVar prediction model, which is the preferred model for diagnostics of Mendelian diseases. 26 The prediction tool SIFT (Sorting Intolerant From Tolerant) categorizes the impact of an amino acid change on protein function based on the SIFT score, which ranges from 0 to 1: <0.05 is classified as damaging and >0.05 as tolerated. 27 The output of MutPred2 (Mutation prediction v. 2.0) consists of a general score that ranges between 0.0 and 1.0, with a higher score indicating a greater probability of pathogenicity. Variants with a general score >0.5 were considered to be pathogenic. 28 The prediction tool SNPs&GO uses a reliability index (RI) to indicate how reliable the prediction is, with 0 being the most unreliable and 10 being the most reliable. Variants are predicted as disease related (disease) when the probability score is >0.5, and otherwise as neutral polymorphism (neutral). 29

PANTHER (Protein Analysis Through Evolutionary Relationships) estimates the probability that a variant impacts protein function based on the evolutionary preservation of a position in the protein. The likelihood of a deleterious effect increases with longer preservation time. The threshold values are >450 million years (my) for a ‘probably damaging’ prediction, between 200 my and 450 my for ‘possibly damaging’ and <200 my for a ‘probably benign’ prediction. 30 The prediction tool PhD-SNP (Predictor of Human Deleterious SNP) classifies variants as ‘neutral polymorphism’ or ‘disease related’ with values from 0 to 1, and the decision threshold for disease related is >0.5. It also uses a reliability index (RI) that ranges from 0 to 10, from the most unreliable to the most reliable prediction. The prediction was run with 20-fold cross-validation. 31 PROVEAN (Protein Variant Effect Analyzer) uses the PROVEAN score for binary classification of variants as either deleterious or neutral. The threshold is set to > −2.5 for neutral predictions and < −2.5 for deleterious predictions. 32 The prediction tool Mutation Taster 2 employs a Bayes classifier to assign a variant to one of four possible types: ‘disease causing’, that is, probably deleterious; ‘disease-causing automatic’, that is, known to be deleterious; ‘polymorphism’, that is, probably harmless; and ‘polymorphism automatic’, that is, known to be harmless. It also reports probability values for the predictions that range between 0 and 1, with values closer to 1 indicating a more reliable prediction. 33

RegSNP-intron is a tool that predicts the pathogenicity of intron variants, with probability scores of 0.00–0.36 for benign variants, 0.36–0.45 for possibly damaging variants and 0.45–1.00 for damaging variants. 40 HSF (Human Splicing Finder) allows the identification of all splicing features through multiple algorithms, and also the prediction of the impact of intron variants on these features. 37 IntSplice is a tool that predicts the splicing consequence of an intron variant close to the 3’ end of an intron, and the result is either abnormal or normal splicing. 38
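As a compact summary of the numeric cut-offs listed above, the following Python sketch applies each threshold to a tool's score. It is illustrative only: the function names and the dictionary keys are hypothetical, the scores are assumed to have been obtained from the respective web servers beforehand, and the handling of values that fall exactly on a boundary (for example 0.36 or 450 my) is an assumption, since the text does not specify it. The helper `cadd_top_fraction` uses the standard PHRED relation, which reproduces the two examples given for CADD (10 → 10%, 20 → 1%).

```python
def cadd_top_fraction(phred: float) -> float:
    """Fraction of most deleterious substitutions implied by a PHRED-scaled score."""
    return 10 ** (-phred / 10)


def panther_class(preservation_my: float) -> str:
    """Classify by preservation time in millions of years (my)."""
    if preservation_my > 450:
        return "probably damaging"
    if preservation_my >= 200:
        return "possibly damaging"
    return "probably benign"


def regsnp_intron_class(probability: float) -> str:
    """Classify an intron variant by its RegSNP-intron probability score."""
    if probability < 0.36:
        return "benign"
    if probability < 0.45:
        return "possibly damaging"
    return "damaging"


# Cut-offs as stated in the text; True means "predicted pathogenic/damaging".
RULES = {
    "CADD": lambda s: s > 15,        # PHRED-scaled score
    "PolyPhen-2": lambda s: s > 0.50,
    "SIFT": lambda s: s < 0.05,      # lower SIFT scores are damaging
    "MutPred2": lambda s: s > 0.5,
    "SNPs&GO": lambda s: s > 0.5,
    "PhD-SNP": lambda s: s > 0.5,
    "PROVEAN": lambda s: s < -2.5,   # more negative scores are deleterious
}


def damaging_calls(scores: dict) -> dict:
    """Map each known tool name to a True/False pathogenicity call."""
    return {tool: RULES[tool](value) for tool, value in scores.items() if tool in RULES}
```

For instance, `damaging_calls({"CADD": 22.0, "SIFT": 0.2, "PROVEAN": -3.1})` marks the variant as damaging by CADD and PROVEAN but tolerated by SIFT.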

The variants were named in agreement with the standard international nomenclature guidelines of the Human Genome Variation Society. 41


Putting it all Together: Toward a New Definition of the “Gene”

Where do all these considerations leave us? It took approximately half a century to go from Johannsen’s wholly abstract formulation of the term “gene” as a “unit of heredity” to the early 1960s concept of the gene as a continuous segment of DNA sequence specifying a polypeptide chain. A further half century’s worth of experimental investigation has brought us to the realization that the 1960s definition is no longer adequate as a general one. Yet the term “gene” persists as a vaguely understood generic description. It is, to say the least, an anomalous situation that the central term of genetics should now be shrouded in confusion and ambiguity. That is not only intellectually unsatisfactory for the discipline, but also has detrimental effects on the popular understanding of genetics. Such misunderstanding is seen most starkly in the situation noted earlier, the commonly held view that there are individual genes responsible “for” certain complex conditions, e.g., schizophrenia, alcoholism, etc. A clearer definition of the term would thus help both the field of genetics and, ultimately, public understanding.

Here, therefore, we will propose a definition that we believe comes closer to doing justice to the idea of the “gene,” in light of current knowledge. It makes no reference to “the unit of heredity”—the long-standing sense of the term—because we feel that it is now clear that no such generic universal unit exists. By referring to DNA sequences, however, our definition embodies the hereditary dimension of genes (in a way that pure “process”-centered definitions focused on gene expression do not). Furthermore, in its emphasis on the ultimate molecular products and reference to GRNs as both evokers and mediators of the actions of those products, it recognizes the long causal chains that often operate between genes and their effects. Our provisional definition is this:

A gene is a DNA sequence (whose component segments do not necessarily need to be physically contiguous) that specifies one or more sequence-related RNAs/proteins that are both evoked by GRNs and participate as elements in GRNs, often with indirect effects, or as outputs of GRNs, the latter yielding more direct phenotypic effects.

This is an explicitly “molecular” definition, but we think that is what is needed now. In contrast, “genes” that are identified purely by their phenotypic effects, as for example in genome-wide association study (GWAS) experiments, would, in our view, not deserve such a characterization until found to specify one or more RNAs/proteins. The genetic effects picked up in such work often identify purely regulatory elements, and these should not qualify as genes, only as part of genes. Our definition, like the classic 1960s’ formulation, makes identifying the product(s) crucial to delimiting, hence identifying, the genes themselves. It, however, also emphasizes the molecular and cellular context in which those products form and function. Those larger contexts, in effect, become necessary to define the function of the specifying gene(s).

The new definition, however, is slightly cumbersome. We therefore offer it only as a tentative solution, hence as a challenge to the field to find a better formulation but one that does justice to the complex realities of the genetic material uncovered in the past half-century.

Of fundamental importance in the operational definition of the gene is the cis-trans test (Lewis 1951; Benzer 1957). To test whether mutations a and b belong to the same gene or cistron (Benzer 1957), or to different cistrons, the cis-heterozygote a b/+ + and the trans-heterozygote a +/+ b are compared. If the cis-heterozygotes and the trans-heterozygotes are phenotypically similar (usually wild type), the mutations are said to “complement” one another, and are inferred to fall into different cistrons. If, however, the cis-heterozygotes and the trans-heterozygotes are phenotypically different, the trans-heterozygote being (usually) mutant and the cis-heterozygote (usually) wild type, the mutations do not complement, and are inferred to belong to the same cistron. The attached figure clarifies the idea.

The principle of the cis-trans test. If mutations a and b belong to the same cistron, the phenotypes of the cis- and trans-heterozygotes are different. If, however, the cis- and trans-heterozygotes are phenotypically similar, the mutations a and b belong to different cistrons. The notation “works” in the figure means that the cistron is able to produce a functional polypeptide. Mutations a and b are recessive mutations that both affect the same phenotypic trait, for example the eye color of D. melanogaster.
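The logic of the test can be made concrete with a small Python sketch. It is a toy model under simplifying assumptions: both mutations are recessive loss-of-function mutations (as in the caption), a cistron on a given homolog “works” only if it carries no mutation, and the phenotype is wild type when every cistron works on at least one homolog. The function names and the cistron labels `"X"` and `"Y"` are hypothetical.

```python
def phenotype(homolog1: set, homolog2: set, cistron_of: dict) -> str:
    """Return 'wild type' if every cistron works on at least one homolog."""
    for cistron in set(cistron_of.values()):
        works_on_1 = not any(cistron_of[m] == cistron for m in homolog1)
        works_on_2 = not any(cistron_of[m] == cistron for m in homolog2)
        if not (works_on_1 or works_on_2):
            return "mutant"
    return "wild type"


def cis_trans_test(cistron_of: dict) -> str:
    """Compare the cis (a b/+ +) and trans (a +/+ b) heterozygotes."""
    cis = phenotype({"a", "b"}, set(), cistron_of)     # both mutations on one homolog
    trans = phenotype({"a"}, {"b"}, cistron_of)        # one mutation on each homolog
    if cis != trans:
        return "same cistron"          # no complementation
    return "different cistrons"        # complementation


# Both mutations in cistron X: the trans-heterozygote has no working copy of X.
print(cis_trans_test({"a": "X", "b": "X"}))
# Mutations in different cistrons: each homolog supplies the other's missing function.
print(cis_trans_test({"a": "X", "b": "Y"}))
```

In the first case the trans-heterozygote is mutant while the cis-heterozygote is wild type, so the mutations are assigned to the same cistron. In the second case both heterozygotes are wild type, and the mutations are assigned to different cistrons.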

Genetic background effects typically exhibit either of two forms when a pre-existing mutation, with an associated phenotypic manifestation, is crossed into a different strain: the reduction (“suppression”) of the mutant phenotype or its increase (“enhancement”). The effects involve changes in the degree (“expressivity”) of the mutant effect, or in the number of individuals affected (its “penetrance”), or both. When analyzed genetically, these effects could often be traced to specific “suppressor” or “enhancer” loci, which could be either tightly linked to or distant in the genome from the original mutant locus. Typically regarded as an unnecessary complication in analysis of the original mutation, they were usually not pursued further. Yet, in terms of current understanding of GRNs, they are not, in principle, mysterious. Each gene that is part of a GRN can be thought of as transmitting a signal for either the activation or the repression of one or more other “downstream” genes in that network. Given the hierarchical nature of GRNs, it follows that the effect of a mutational alteration in a specific gene in the network can be either strengthened or reduced by other mutational changes in the network, either upstream or downstream of the original mutation. The particular effect achieved will depend on the characteristics of each of the two mutations involved (whether they are loss-of- or gain-of-function mutations) and the precise nature of their connectivity. Such effects are most readily illustrated with linear sequences of gene actions, genetic pathways (Wilkins 2007), but can be understood in networks when the network structure and the placement of the two genes within it are known. Some genetic background effects, in principle, however, might involve partially redundant networks, in which the effects of the two pathways are additive. 
In those cases, a mutant effect in one pathway may be either compensated, hence suppressed, or exacerbated, by a second mutation in the other pathway, the precise effects again depending upon the specific characteristics of the mutations and the degree of redundancy between the two GRNs.

