Microbial communities and their predicted metabolic characteristics in deep fracture groundwaters of the crystalline bedrock at Olkiluoto, Finland

The microbial diversity in oligotrophic isolated crystalline Fennoscandian Shield bedrock fracture groundwaters is great but the core community has not been identified. Here we characterized the bacterial and archaeal communities in 12 water conductive fractures situated at depths between 296 m and 798 m by high throughput amplicon sequencing using the Illumina HiSeq platform. Between 1.7 × 10 – 1.2 × 10 bacterial or archaeal sequence reads per sample were obtained. These sequences revealed that up to 95 % and 99 % of the bacterial and archaeal sequences obtained, respectively, belonged to only a few common species, i.e. the core microbiome. However, the remaining rare microbiome contained over 3- and 6-fold more bacterial and archaeal taxa. The metabolic properties of the microbial communities were predicted using PICRUSt. The approximate estimation showed that the metabolic pathways commonly included fermentation, fatty acid oxidation, glycolysis/gluconeogenesis, oxidative phosphorylation and methanogenesis/anaerobic methane oxidation, but carbon fixation through the Calvin cycle, reductive TCA cycle and the Wood-Ljungdahl pathway was also predicted. The rare microbiome is an unlimited source of genomic functionality in all ecosystems. It may consist of remnants of microbial communities prevailing in earlier environmental conditions, but could also be induced again if changes in their living conditions occur.


Introduction
Identifying and understanding the core microbiome of any given environment is of crucial importance for predicting and assessing environmental change both locally and globally (Shade and Handelsman, 2012). In a previous study (Bomberg et al., 2015) we showed by 454 amplicon sequencing that the active microbial communities in the Olkiluoto deep subsurface were strictly stratified according to aquifer water type. Nevertheless, more rigorous sequencing efforts and more samplings have shown that an archaeal core community consisting of the Deep Sea Hydrothermal Vent Euryarchaeotal Group 6 (DHVEG-6), ANME-2D, and Terrestrial Miscellaneous Group (TMEG) archaea may exist in the anaerobic deep groundwater of Olkiluoto (Miettinen et al., 2015). The bacterial core groups in Olkiluoto deep groundwater include at least members of the Pseudomonadaceae, Comamonadaceae, and Sphingomonadaceae (Bomberg et al., 2014, 2015; Miettinen et al., 2015). The relative abundance of these main groups varies at different depths from close to the detection limit to over 90 % of the bacterial or archaeal community (Bomberg et al., 2015; Miettinen et al., 2015). However, both the archaeal and bacterial communities contain a wide variety of less abundant groups, which are distributed unevenly in the different water-conductive fractures.
The rare biosphere is a concept describing the hidden biodiversity of an environment (Sogin et al., 2006). The rare biosphere consists of microbial groups that are ubiquitously distributed in nature but often present at low relative abundance and may thus stay below the limit of detection. Due to modern high throughput sequencing techniques, however, the hidden diversity of rare microbiota has been revealed. These microorganisms are the basis for unlimited microbial functions in the environment and upon environmental change specific groups can readily activate and become abundant. Access to otherwise inaccessible nutrients activates specific subpopulations in the bacterial communities within hours of exposure (Rajala et al., 2015) and enriches distinct microbial taxa at the expense of the original microbial community in the groundwater (Kutvonen, 2015). Mixing of different groundwater layers due to e.g. breakage of aquifer boundaries and new connection of separated aquifers may cause the microbial community to change and activate otherwise dormant processes. This has previously been shown by Pedersen et al. (2013), who indicated increased sulfate reduction activity when sulfate-rich and methane-rich groundwater mixed. Deep subsurface microbial communities in isolated deep subsurface groundwater fractures are assumed to be stable. However, there are indications that they may change over the span of several years, as slow flow along fractures is possible (Miettinen et al., 2015; Sohlberg et al., 2015).

The microbial taxa present in an environment interact with both biotic and abiotic factors. In deep subsurface groundwater the biomass concentration is often low and the sampling efforts may not yield enough biomass for extensive metagenomic analysis of the microbial communities. Tools for predicting metabolic pathways may help to establish a consensus of the microbial metabolic characteristics present in an environment and the possible interactions of the microbial communities with the abiotic environment. Tools such as PICRUSt (Langille et al., 2013) allow us to estimate microbial metabolic functions based on NGS microbiome data. For example, Tsitko et al. (2014) showed that oxidative phosphorylation was the most important energy-producing metabolic pathway throughout the 7 m depth profile of an acidobacteria-dominated nutrient-poor boreal bog. Cleary et al. (2015) showed that tropical mussel-associated bacterial communities could be important sources of bioactive compounds for biotechnology. This approach is nevertheless hampered by the fact that little is so far known about uncultured environmental microorganisms and their functions, and the PICRUSt approach is best applied to the human microbiome, for which it was initially developed (Langille et al., 2013). However, metagenomic estimations may give important indications of novel metabolic possibilities even in environmental microbiome studies.

Published by Copernicus Publications on behalf of the European Geosciences Union.

Table 1. Geochemical and microbiological measurements from 12 different water conductive fractures in the bedrock of Olkiluoto, Finland. The different drill holes are presented at the top of the table. The data are compiled from Posiva (2013) and Miettinen et al. (2015). NPOC: non-purgeable organic carbon; DIC: dissolved inorganic carbon; TDS: total dissolved solids; TNC: total number of cells; na: not available.
(unlabeled row): 4.2 × 10⁵ | 1.0 × 10⁵ | 2.4 × 10⁵ | 2.5 × 10⁵ | 2.1 × 10⁵ | 1.5 × 10⁴ | na | 2.9 × 10⁴ | 5.9 × 10⁴ | 8.7 × 10⁴ | 5.5 × 10⁴ | 2.3 × 10⁴
16S qPCR mL⁻¹, Bacteria: 7.0 × 10⁵ | 9.5 × 10³ | 2.0 × 10⁴ | 3.6 × 10⁵ | 4.9 × 10⁴ | 1.3 × 10⁴ | 7.2 × 10⁴ | 1.5 × 10⁵ | 1.4 × 10⁵ | 1.9 × 10⁴ | 3.2 × 10⁴ | 1.5 × 10⁴
16S qPCR mL⁻¹, Archaea: 5.8 × 10³ | 2.0 × 10⁴ | 9.9 × 10³ | 6.3 × 10⁴ | 6.2 × 10³ | 1.5 × 10² | 4.4 × 10⁴ | 5.2 × 10² | 7.5 × 10² | 3.0 × 10³ | 2.6 × 10¹ | 2.8 × 10²
Using extensive high throughput amplicon sequencing in this study, we aimed to identify the core microbiome in the deep crystalline bedrock fractures of Olkiluoto and also to study the rare microbiome. In addition, we aimed to estimate the prevailing metabolic activities that may occur in the deep crystalline bedrock environment of Olkiluoto, Finland.

Background
The Olkiluoto site has previously been extensively described (Posiva, 2013) and is only briefly described here. The island of Olkiluoto, situated on the western coast of Finland, has approximately 60 drill holes drilled for research and monitoring purposes. Studies on the chemistry and microbiology of the groundwater have been ongoing since the 1980s. The groundwater is stratified with a salinity gradient extending from fresh to brackish water to a depth of 30 m and the highest salinity concentration of 125 g L⁻¹ total dissolved solids (TDS) at 1000 m depth (Posiva, 2013). The most abundant salinity-causing cations are Na⁺ and Ca²⁺, and the most abundant anion is Cl⁻. Between 100 and 300 m depths, the groundwater originates from ancient (pre-Baltic) seawater and has high concentrations of SO₄²⁻. Below 300 m the concentration of methane in the groundwater increases and SO₄²⁻ is almost absent. A sulfate-methane transition zone (SMTZ), where sulfate-rich fluid replaces methane-rich fluid, is located at 250-350 m depth. Temperature rises linearly with depth, from ca. 5-6 °C at 50 m to ca. 20 °C at 1000 m depth (Ahokas et al., 2008). The pH of the groundwater is slightly alkaline throughout the depth profile. Multiple drill holes intersect several groundwater-filled bedrock fractures, including larger hydrogeological zones such as HZ20 or HZ21 (Table 1). The bedrock of Olkiluoto consists mainly of micagneiss and pegmatitic granite-type rocks (Kärki and Paulamäki, 2006). The in situ temperature at 300 m depth in the Olkiluoto bedrock is stable at approximately 10 °C and increases linearly to approximately 16 °C at 800 m depth (Sedighi et al., 2013).
This study focused on 12 groundwater samples from water conductive fractures situated between 296 and 798 m below sea level (bsl) and originating from 11 different drill holes in Olkiluoto (Fig. 1). The samples represented brackish sulfate waters and saline waters (as classified in Posiva, 2013). The samples were collected between December 2009 and January 2013 (Table 1). The physicochemical parameters of the groundwater samples have been reported by Miettinen et al. (2015), but have for clarity been collected here (Table 1).

Sample collection
The collection of samples occurred between December 2009 and January 2013 (Table 1) as described previously (Bomberg et al., 2015; Miettinen et al., 2015; Sohlberg et al., 2015). The samples were obtained from 11 different permanently packered or open drill holes equipped with removable inflatable packers. The position and direction of the drill holes are indicated in Fig. 1. Briefly, in order to obtain indigenous fracture fluids, the packer-isolated fracture zones were purged by removing stagnant drill-hole water by pumping for a minimum of 4 weeks before the sample water was collected. The water samples were collected directly from the drill hole into an anaerobic glove box (MBRAUN, Germany) via a sterile, gas-tight polyacetate tube (8 mm outer diameter). Microbial biomass for DNA extraction was concentrated from 1000 mL samples by filtration on cellulose acetate filters (0.2 µm pore size, Corning) by vacuum suction inside the glove box. The filters were immediately extracted from the filtration funnels and frozen on dry ice in sterile 50 mL cone tubes (Corning). The frozen samples were transported on dry ice to the laboratory where they were stored at −80 °C until use.

www.biogeosciences.net/13/6031/2016/ Biogeosciences, 13, 6031-6047, 2016

Nucleic acid isolation
Community DNA was isolated directly from the frozen cellulose-acetate filters with the PowerSoil DNA extraction kit (MoBio Laboratories, Inc., Solana Beach, CA), as previously described (Bomberg et al., 2015). Negative DNA isolation controls were included in the isolation protocol. The DNA concentration of each sample was determined using the NanoDrop 1000 spectrophotometer.

Estimation of microbial community size
The size of the microbial community was determined by epifluorescence microscopy of 4′,6-diamidino-2-phenylindole dihydrochloride (DAPI; Sigma, MO, USA) stained cells as described in Purkamo et al. (2013). The size of the bacterial population was determined by 16S rRNA gene-targeted quantitative PCR (qPCR) as described by Tsitko et al. (2014) using universal bacterial 16S rRNA gene-targeting primers fD1 (Weisburg et al., 1991) and P2 (Muyzer et al., 1993), which specifically target the V1-V3 region of the bacterial 16S rRNA gene. The size of the archaeal population in the groundwater was determined by using primers ARC344f (Bano et al., 2004) and Ar744r (reverse complement from Barns et al., 1994) flanking the V4-V6 region of the archaeal 16S rRNA gene. The qPCR reactions were performed in 10 µL reaction volumes using the KAPA 2× SYBR® FAST qPCR kit on a LightCycler 480 qPCR machine (Roche Applied Science, Germany) on white 96-well plates (Roche Applied Science, Germany) sealed with transparent adhesive seals (4titude, UK). Each reaction contained 2.5 µM of relevant forward and reverse primer and 1 µL DNA extract. Each reaction was run in triplicate and no-template control reactions were used to determine background fluorescence in the reactions.
The qPCR conditions consisted of an initial denaturation at 95 °C for 10 min followed by 45 amplification cycles of 15 s at 95 °C, 30 s at 55 °C, and 30 s at 72 °C with a quantification measurement at the end of each elongation. A final extension step of 3 min at 72 °C was performed prior to a melting curve analysis. This consisted of a denaturation step for 10 s at 95 °C followed by an annealing step at 65 °C for 1 min prior to a gradual temperature rise to 95 °C at a rate of 0.11 °C s⁻¹ during which the fluorescence was continuously measured. The number of 16S rRNA genes was determined by comparing the amplification result (Cp) to that of a 10-fold dilution series (10¹-10⁷ copies µL⁻¹) of Escherichia coli (ATCC 31608) 16S rRNA genes in plasmid for bacteria and a dilution series of genomic DNA of Halobacterium salinarum (DSM 3754) for archaea. The lowest detectable standard concentration for the qPCRs was 10² gene copies/reaction. Inhibition of the qPCR by template was tested by adding 2.17 × 10⁴ plasmid copies containing fragments of the morphine-specific Fab gene from Mus musculus to reactions containing template DNA as described in Nyyssönen et al. (2012). Inhibition of the qPCR assay by the template DNA was found to be low. The average crossing point (Cp) value for the standard sample (2.17 × 10⁴ copies) was 28.7 (±0.4 SD), while for the DNA samples Cp was 28. . Nucleic acid extraction and reagent controls were run in all qPCRs in parallel with the samples. Amplification in these controls was never higher than the background obtained from the no-template controls.
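The quantification against a 10-fold dilution series described above can be illustrated with a short sketch. It assumes the usual log-linear standard curve (Cp as a linear function of log10 copy number); the Cp values, function names, and the sample Cp below are hypothetical and are not data from this study.

```python
import math


def fit_standard_curve(copies, cp_values):
    """Least-squares fit of Cp = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cp_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cp_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cp(cp, slope, intercept):
    """Invert the standard curve to estimate gene copies per reaction."""
    return 10 ** ((cp - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency of 1.0 corresponds to perfect doubling in every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical standards: 10^1 to 10^7 copies with idealized Cp values.
standard_copies = [10 ** k for k in range(1, 8)]
standard_cp = [38.0 - 3.32 * k for k in range(1, 8)]

slope, intercept = fit_standard_curve(standard_copies, standard_cp)
efficiency = amplification_efficiency(slope)
sample_copies = copies_from_cp(24.72, slope, intercept)  # hypothetical sample Cp
print(f"slope={slope:.2f}, efficiency={efficiency:.2f}, copies={sample_copies:.3g}")
```

Converting copies per reaction into copies per mL of groundwater additionally requires the elution volume of the DNA extract and the filtered sample volume, which are left out of this sketch.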

Amplicon library preparation
This study is part of the Census of Deep Life initiative, which strives to obtain a census of the microbial diversity in the deep subsurface environment by collecting samples around the world and sequencing the 16S rRNA gene pools of both archaea and bacteria. The extracted DNA samples were sent to the Marine Biological Laboratory in Woods Hole, MA, USA, for preparation for HiSeq sequencing using the Illumina technology. The protocol for amplicon library preparation for both archaeal and bacterial 16S amplicon libraries can be found at http://vamps.mbl.edu/resources/faq.php. Briefly, amplicon libraries for completely overlapping paired-end sequencing of the V6 region of both the archaeal and bacterial 16S rRNA genes were produced as previously described (Eren et al., 2013). For the archaea, primers A958F and A1048R containing TruSeq adapter sequences at their 5′ end were used, and for the bacteria primers B967F and B1064R, for obtaining 100 nt long paired-end reads (https://vamps.mbl.edu/resources/primers.php). The sequencing was performed using a HiSeq 1000 system (Illumina).

Sequence processing and analysis
Contigs of the paired-end fastq files were first assembled with mothur v. 1.32.1 (Schloss et al., 2009). Analyses were subsequently continued using QIIME v. 1.8 (Caporaso et al., 2010). Only sequences with a minimum length of 50 bp were included in the analyses. The bacterial and archaeal 16S rRNA sequences were grouped into OTUs (97 % sequence similarity) using both the open reference and closed reference OTU picking strategy and classified using the Greengenes 13_8 16S reference database (DeSantis et al., 2006). The core archaeal and bacterial communities were identified from the OTU tables with the compute_core_microbiome.py function in QIIME using default values, with the exception of the minimum number of samples where an OTU must be detected, which was set to 80 %. The sequencing coverage was evaluated by rarefaction analysis and the estimated species richness and diversity indices were calculated. For comparable α- and β-diversity analyses the data sets were normalized by random subsampling of 17 000 sequences/sample for archaea and 140 000 sequences/sample for bacteria. Microbial metabolic pathways were estimated based on the 16S rRNA gene data from the closed OTU picking method using the PICRUSt software (Langille et al., 2013) on the web-based Galaxy application (Goecks et al., 2010; Blankenberg et al., 2010; Giardine et al., 2005). The predicted KO numbers were plotted on KEGG pathway maps (http://www.genome.jp/kegg/) separately for the bacterial and archaeal predicted metagenomes, with a threshold of a minimum of 100 genes in the total estimated from all samples.

Table 2. (a) The total number of sequence reads, observed and estimated (Chao1, ACE) number of OTUs, number of singleton and doubleton OTUs, and Shannon diversity index per sample of the bacterial 16S rRNA gene data set. The analysis results are presented for both the total number of sequence reads per sample as well as for data normalized according to the sample with the lowest number of sequence reads, i.e. 140 000 random sequences per sample. (b) The total number of sequence reads, observed and estimated (Chao1, ACE) number of OTUs, number of singleton and doubleton OTUs, and the Shannon diversity index per sample of the archaeal 16S rRNA gene data set. The analysis results are presented for both the total number of sequence reads per sample as well as for data normalized according to the sample with the lowest number of sequence reads, i.e. 17 000 random sequences per sample.
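The normalization by random subsampling to an even depth can be sketched as follows. This is a minimal illustration of subsampling without replacement, not the QIIME implementation used in the study; the function name, the seed, and the toy OTU counts are assumptions.

```python
import random
from collections import Counter


def rarefy(otu_counts, depth, seed=0):
    """Draw `depth` reads without replacement from one sample.

    otu_counts maps an OTU identifier to its read count in the sample.
    """
    total = sum(otu_counts.values())
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)
    # Expand the counts into one entry per read, then sample reads uniformly.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    picked = rng.sample(reads, depth)
    return dict(Counter(picked))


# Toy sample (hypothetical counts); the study used 17 000 reads/sample for
# archaea and 140 000 reads/sample for bacteria.
toy_sample = {"OTU_1": 50, "OTU_2": 30, "OTU_3": 20}
subsample = rarefy(toy_sample, depth=40, seed=1)
print(sum(subsample.values()))
```

Expanding every read into a list is simple but memory-hungry at depths of 10⁵ to 10⁶ reads; a production tool would sample from the count vector directly.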

Statistical analyses and data visualization
The similarity of the archaeal and bacterial communities between the different samples was tested by principal coordinate analysis (PCoA) using the Phyloseq package in R (McMurdie and Holmes, 2014; R Core Team, 2013). The analysis was performed using the raw OTU tables outputted by QIIME. In addition, a PCoA analysis showing the effect of library size on the ordination of the samples was calculated using vegan (Oksanen et al., 2016). The Bray-Curtis distance model was used for both analyses. The samples were hierarchically clustered in a UPGMA tree based on the raw OTU counts using the heatmap function of Phyloseq in R.
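For readers unfamiliar with the Bray-Curtis model, the following sketch computes the dissimilarity between samples from raw OTU counts. It only illustrates the formula; the study itself used Phyloseq and vegan in R, and the sample names and counts here are hypothetical.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples of OTU counts.

    0 means identical composition, 1 means no shared OTUs.
    """
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    denominator = sum(a.values()) + sum(b.values())
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


def distance_matrix(samples):
    """Pairwise dissimilarities as a dict keyed by (sample, sample)."""
    names = sorted(samples)
    return {(i, j): bray_curtis(samples[i], samples[j]) for i in names for j in names}


# Hypothetical raw OTU counts for three samples.
toy_samples = {
    "sample_A": {"OTU_1": 10, "OTU_3": 5},
    "sample_B": {"OTU_1": 5, "OTU_2": 5, "OTU_3": 5},
    "sample_C": {"OTU_4": 8},
}
distances = distance_matrix(toy_samples)
print(round(distances[("sample_A", "sample_B")], 3))
```

A matrix of this kind is the input to both the PCoA ordination and the UPGMA clustering; because raw counts are used, differences in library size also contribute to the dissimilarity.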

Sequence statistics, diversity estimates and sequencing coverage
The number of bacterial V6 sequence reads from the 12 samples varied between 1.4 and 7.8 × 10⁵ reads, with a mean sequencing depth of 2.9 × 10⁵ (±1.8 × 10⁵ standard deviation) reads/sample ( ... groundwater fracture zones of Olkiluoto. The bacterial H was on average 13 (±0.74), ranging from 11 to 14 between the different samples. The archaeal H was on average 11 (±1.2), ranging from 9 to 12 between the samples. A total of 468 684 archaeal and 301 458 bacterial OTUs were obtained in this study.
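The diversity and richness estimates reported here can be illustrated with a small sketch of the Shannon index and the Chao1 estimator. Two assumptions are made: the Shannon index is computed with a base-2 logarithm, and Chao1 uses the bias-corrected form S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs. The counts are toy values, not data from this study.

```python
import math


def shannon_index(counts, base=2.0):
    """Shannon diversity H = -sum(p_i * log(p_i)) over OTUs with reads."""
    total = sum(counts)
    h = 0.0
    for n in counts:
        if n > 0:
            p = n / total
            h -= p * math.log(p, base)
    return h


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from singletons and doubletons."""
    observed = sum(1 for n in counts if n > 0)
    singletons = sum(1 for n in counts if n == 1)
    doubletons = sum(1 for n in counts if n == 2)
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))


# Toy read counts per OTU in one sample (hypothetical).
even_counts = [25, 25, 25, 25]
skewed_counts = [10, 5, 2, 2, 1, 1, 1]

print(shannon_index(even_counts))   # four equally abundant OTUs
print(chao1(skewed_counts))         # 7 observed OTUs, 3 singletons, 2 doubletons
```

With a base-2 logarithm, an index of 13 corresponds to the diversity of roughly 2¹³ equally abundant OTUs, which gives a feel for the scale of the values reported above under that assumption.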

Microbial communities
From the bacterial V6 sequences 49 different bacterial phyla were detected (Supplement 1). These phyla included 165 bacterial classes, 230 orders, 391 families and 651 genera. The greatest number of sequences, between 21.83 and 47.94 % per sample, clustered into an undetermined bacterial group (Bacteria, Other). This may be due to the fact that sequences of poorer quality may be difficult to classify, especially as the sequences are short, or these sequences may belong to thus far uncultured and unknown bacterial species, the so-called microbial dark matter (Solden et al., 2016). The archaea were represented by two identified phyla, the Euryarchaeota and the Crenarchaeota (Supplement 2). These included 21 classes, 38 orders, 61 families, and 81 genera. Between 4.7 and 35.0 % of the archaeal sequences of each sample were classified as unassigned archaea, with a general increase in unassigned archaeal sequences with increasing depth.
The archaeal and bacterial core communities were determined as OTUs present in at least 80 % of the samples. Of the more than 4.6 × 10⁵ archaeal OTUs, the core community consisted of 82 OTUs belonging to three archaeal orders, the E2 of the Thermoplasmatales, the Methanobacteriales, and the Methanosarcinales (Fig. 3). Additionally, a great proportion of the OTUs of the core community did not receive any taxonomic identity other than archaea. The most common archaeal family of the core community was the ANME-2D belonging to the Methanosarcinales. The bacterial core community consisted of only 26 OTUs, compared to more than 3.0 × 10⁵ bacterial OTUs in total (Fig. 3). These OTUs belonged to six different orders, the Alteromonadales, Burkholderiales, SB-45, Sphingomonadales, Syntrophobacterales, and Thiobacterales. In addition, a great portion of the core community OTUs were classified only as unassigned bacteria. The most abundant of the bacterial core community OTUs belonged to Thiobacteraceae and Comamonadaceae. In both the archaeal and bacterial sequence data a great proportion of the sequence reads were only identified as archaea or bacteria, without more detailed taxonomic assignments. The core OTUs were distributed with different abundances in the different samples (Fig. 3). Most of the OTUs detected were present in less than 20 % of the samples (Fig. 4).
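The prevalence criterion used to define the core community can be sketched as below. This is an illustration of the idea only, not the QIIME compute_core_microbiome.py script; the function names, sample names, and counts are hypothetical.

```python
from collections import Counter


def core_otus(otu_table, min_percent=80):
    """Return OTUs detected in at least `min_percent` % of the samples.

    otu_table maps a sample name to a dict of OTU identifier -> read count.
    """
    n_samples = len(otu_table)
    prevalence = Counter()
    for counts in otu_table.values():
        for otu, reads in counts.items():
            if reads > 0:
                prevalence[otu] += 1
    # Integer comparison avoids floating-point issues at the threshold.
    return {otu for otu, k in prevalence.items() if k * 100 >= min_percent * n_samples}


def core_read_fraction(counts, core):
    """Fraction of one sample's reads that belong to core OTUs."""
    total = sum(counts.values())
    return sum(n for otu, n in counts.items() if otu in core) / total


# Five hypothetical samples: OTU_a in all, OTU_b in four, OTU_c in three, OTU_d in one.
toy_table = {
    "s1": {"OTU_a": 40, "OTU_b": 10, "OTU_c": 5, "OTU_d": 45},
    "s2": {"OTU_a": 30, "OTU_b": 20, "OTU_c": 50},
    "s3": {"OTU_a": 60, "OTU_b": 30, "OTU_c": 10},
    "s4": {"OTU_a": 90, "OTU_b": 10},
    "s5": {"OTU_a": 100},
}
core_80 = core_otus(toy_table, min_percent=80)
core_100 = core_otus(toy_table, min_percent=100)
print(sorted(core_80), sorted(core_100))
```

With 12 samples, an 80 % threshold means an OTU must be detected in at least 10 samples, whereas a 100 % threshold requires detection in all 12.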

Environmental parameters and the microbial communities
The microbial community profiles of the different samples were clustered in a UPGMA tree based on the OTU tables and the Bray-Curtis distance model (Fig. 5). The archaeal and bacterial communities clustered according to the OTUs detected in the samples, but not clearly according to any physicochemical parameter. In the PCoA analysis, however, the archaeal communities in water containing total dissolved solids (TDS) between 10 670 and 18 580 mg L⁻¹ clustered together (Figs. 6a, S2a), but no similar clustering could be observed in the bacterial communities (Figs. 6b, S2b). Nevertheless, principal coordinate 1 explained 14.9 % and coordinate 2 12.6 % of the variance in the archaeal communities and 20 and 17.2 %, respectively, of the variance in the bacterial communities (Figs. 6a, b, S2c, d).

Predicted metabolic functions of the deep subsurface microbial communities
The putative metabolic functions of the microbial communities at different depths were predicted using the PICRUSt software, which compares the identified 16S rRNA gene sequences to those of known genome sequenced species, thereby estimating the possible gene contents of the uncultured microbial communities. The analysis is only an approximation, but may give an idea of the possible metabolic activities in the deep biosphere. In order to evaluate the soundness of the analysis, a nearest sequenced taxon index (NSTI) for each of the bacterial and archaeal communities was calculated by PICRUSt. An NSTI value of 0 indicates high similarity to the closest sequenced taxon, while NSTI = 1 indicates no similarity. The NSTI of the bacterial communities at different depths varied between 0.045 in sample OL-KR44 and 0.168 in sample OL-KR13 (Table 3). The NSTI for archaea were much higher, ranging from 0.141 in sample OL-KR9 at a depth of 432 m to 0.288 in OL-KR44. This indicates that the metagenomic estimates are only indicative. The estimated microbial metabolism did not differ noticeably between the different depths (Fig. 7a and b). The most important predicted metabolic pathways included membrane transport in both bacterial and archaeal communities. The most common pathways predicted for carbohydrate metabolism were the butanoate, propionate, glycolysis/gluconeogenesis, and pyruvate metabolism pathways for the bacteria and glycolysis/gluconeogenesis and pyruvate metabolism pathways for the archaea (Fig. 8). Glucose is predicted to be converted into pyruvate and further to acetyl-CoA by both bacteria and archaea. The bacterial community may produce and utilize acetate. Both the bacterial and archaeal communities are predicted to fix carbon via the Wood-Ljungdahl (WL), reverse citric acid cycle (rTCA), and Calvin pathways.
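The logic behind this kind of prediction can be sketched in a few lines. The sketch assumes the general PICRUSt-style workflow (read counts are divided by the predicted 16S rRNA gene copy number and multiplied by the predicted gene content of each OTU) and treats NSTI as the abundance-weighted mean distance of each OTU to its nearest sequenced relative. It is not the PICRUSt code, and all identifiers, distances, copy numbers, and gene labels are hypothetical.

```python
def weighted_nsti(otu_counts, nearest_distance):
    """Abundance-weighted mean distance to the nearest sequenced genome."""
    total = sum(otu_counts.values())
    return sum(n * nearest_distance[otu] for otu, n in otu_counts.items()) / total


def predict_gene_counts(otu_counts, copy_number, gene_content):
    """Predict community gene counts from 16S read counts.

    copy_number: predicted 16S rRNA gene copies per genome for each OTU.
    gene_content: predicted gene copies per genome for each OTU.
    """
    predicted = {}
    for otu, reads in otu_counts.items():
        organisms = reads / copy_number[otu]  # correct for multi-copy 16S genes
        for gene, per_genome in gene_content[otu].items():
            predicted[gene] = predicted.get(gene, 0.0) + organisms * per_genome
    return predicted


# Hypothetical toy community with two OTUs.
toy_counts = {"OTU_1": 60, "OTU_2": 40}
toy_distance = {"OTU_1": 0.05, "OTU_2": 0.25}
toy_copy_number = {"OTU_1": 2, "OTU_2": 1}
toy_gene_content = {
    "OTU_1": {"KO_a": 1},
    "OTU_2": {"KO_a": 0, "KO_b": 1},
}

nsti = weighted_nsti(toy_counts, toy_distance)
genes = predict_gene_counts(toy_counts, toy_copy_number, toy_gene_content)
print(round(nsti, 3), genes)
```

The sketch makes the limitation visible: the predicted gene content of an OTU is only as good as its nearest sequenced relative, so a high NSTI translates directly into less reliable gene counts.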
Methane is most likely produced from methylamines, CO₂, and methanol by the methanogenic archaea. In addition to the strong evidence of methanogenesis in the archaeal community, the reverse methanogenesis, i.e. anaerobic methane oxidation by the ANME-2D archaea, is possible. Based on the predicted metagenomes the bacterial community is not able to oxidize methane or hydrolyze methanol, but the methylotrophs present may use formic acid and trimethylamines.
The most abundant predicted energy metabolic pathway in the bacterial communities was oxidative phosphorylation (Fig. S3), while for the archaea methane metabolism was the most important (Figs. 7, 8). Our analyses predicted utilization of propanoate and butanoate (Fig. 7) by the bacterial communities as well as completely covered fatty acid biosynthesis and degradation pathways, which indicate that the bacterial community may be capable of fermentation (Fig. S4a and b). The bacterial community is predicted to reduce nitrate both through dissimilatory nitrate reduction to ammonia and through denitrification to nitrous oxide (Fig. S5). In addition, the predicted metagenomes indicate that nitrogen is fixed to ammonia by both archaea and bacteria. The ammonia is furthermore predicted to be used as raw material for L-glutamate synthesis (Fig. S5). Sulfur metabolism was not a major pathway in either the bacterial or archaeal communities according to the predicted number of genes. However, assimilatory sulfate reduction was indicated in both the bacterial and archaeal communities, while dissimilatory sulfate reduction and sulfur oxidation were indicated only in the bacterial communities (Fig. S6).
Several amino acid synthesis pathways were predicted (Fig. 7), of which the most prominent were the alanine, aspartate and glutamate synthesis, arginine and proline synthesis, cysteine and methionine synthesis, glycine, serine and threonine synthesis, phenylalanine, tyrosine and tryptophan synthesis, and the valine, leucine, and isoleucine synthesis pathways.
Different types of membrane transport (ABC transporters) were predicted where sulfate and iron (III) would be taken up by the bacteria and tungstate, molybdate, proline, zinc, cobalt, and nickel by both archaea and bacteria (Fig. S7).The estimated number of genes for both the purine and pyrimidine metabolism was more than 2 times higher in the archaeal community than in the bacterial community (Fig. 7a and b).

Discussion
The phenotypic characteristics of the Fennoscandian Shield deep subsurface microbial communities are still largely unknown, although specific reactions to introduced environmental stimulants have been shown (e.g. Pedersen et al., 2013, 2014; Rajala et al., 2015; Kutvonen, 2015). Nevertheless, the connection of these microbial responses to specific microbial groups is still only in an early phase. Metagenomic and gene-specific analyses of deep subsurface microbial communities have revealed prominent metabolic potential of the microbial communities, which appear to be associated with the prevailing lithology and physicochemical parameters (Nyyssönen et al., 2014; Purkamo et al., 2015). It has also been shown with fingerprinting methods with ever increasing efficiency that the bacterial and archaeal communities are highly diverse in the saline anaerobic Fennoscandian deep fracture zone groundwater (Bomberg et al., 2014, 2015; Nyyssönen et al., 2012, 2014; Pedersen et al., 2014; Miettinen et al., 2015; Sohlberg et al., 2015). Nevertheless, the concentration of microbial cells in the groundwater is quite low (Fig. 2, Table 1). In accordance with other Fennoscandian deep subsurface environments (Purkamo et al., 2016), most of the microbial communities at different depths in Olkiluoto bedrock fractures consist of bacteria. Archaea have in general been shown to constitute at most approximately 1 % of the microbial community in Fennoscandian deep bedrock groundwater (Purkamo et al., 2016). However, at specific depths in Olkiluoto (328, 423 m) the archaea contributed over 50 % of the estimated 16S rRNA gene pool (Table 1). The major archaeal groups present at these depths were the ANME-2D archaea, indicating that nitrate-mediated anaerobic oxidation of methane may be especially common (Haroon et al., 2013). The high abundance of archaea in Olkiluoto is special for this environment. Archaea have also been quantified from the Outokumpu deep scientific borehole (Purkamo et al., 2016), but unlike the situation in Olkiluoto the archaeal community was less than 1 % of the total community at best.
Previously, using 454 amplicon sequencing, we have observed OTU numbers of approximately 800 OTUs per sample covering approximately 550 bacterial genera (or equivalent groups) and approximately 350 archaeal OTUs including approximately 80 different genera (or equivalent groups; Miettinen et al., 2015). Miettinen et al. (2015) defined the OTUs at 97 % sequence homology and the number of sequence reads per sample was at most in the range of 10⁴. In contrast, our sequence-read numbers were 10- to 100-fold higher and the number of OTUs per sample in general 100-fold higher. This indicates that a greater sequencing depth increases the number of taxa detected from the subsurface environment and allows us a novel view of the previously rare biosphere. Nevertheless, in comparison to the high number of OTUs detected, the number of identified genera, 651 and 81 bacterial and archaeal genera, respectively, seems low. On the other hand, this indicates that the sequencing depth has been sufficient to detect most of the prokaryotic groups present. Nevertheless, the obtained numbers of OTUs per sample in this study were huge (Table 2). This may reflect the high level of variability in the short sequence reads of the V6 region used in this study. Huse et al. (2008) discussed the problem with short sequence reads: these reads very often match full-length 16S rRNA gene sequences belonging to several different taxa, so a precise taxonomic assignment of the short sequence read cannot be made. As shown in our study, taxonomic assignments such as "Proteobacteria_other" were common and may be due to multiple matches for the individual sequence reads obtained in the identification step of the analysis. However, it may also be possible that the sequences represent novel microbial clades, the so-called microbial dark matter (Solden et al., 2016), which do not have representatives in the databases yet.
In general, the microbial communities at different depths grouped loosely into clusters (Fig. 6). Although no clear environmental factor seemed to drive the microbial communities at different depths, the core communities appeared to be more similar in samples from similar depths, especially in the bacterial communities (Fig. 3). OTUs belonging to both sulfate reducers (Desulfobacteraceae) and sulfur oxidizers (Thiobacteraceae) were present in the bacterial core community. The archaeal core community consisted mostly of methane-oxidizing ANME-2D archaea. Interestingly, however, their abundance was higher in the deeper samples. Previous studies on the Finnish deep biosphere have shown that the microbial communities at different sites vary strongly from each other. Purkamo et al. (2015) investigated the bacterial and archaeal communities of different fracture zones of the Outokumpu deep scientific borehole and found that the majority of the bacterial populations at depths between 180 and 500 m consist of Betaproteobacteria belonging to the Comamonadaceae, and the archaeal communities consist of Methanobacteriaceae and Methanoregula.
The core communities, defined as OTUs present in all the studied samples, accounted for between 0.2 and 11.7 % of the archaeal and 0.4-4.1 % of the bacterial sequence reads. These proportions are surprisingly low and show that most of the archaeal and bacterial communities (83.3-99.8 and 95.5-99.6 % of the archaeal and bacterial sequence reads, respectively) consist of taxa that are present in only specific samples. This diversity may nevertheless be overestimated because of the short read length and high sequence variability within the V6 region. At the genus level, between 95 and 99 % of the archaeal sequence reads fell within only 25 genera, which were present in all samples. Likewise for the bacterial communities, 80-97 % of the sequence reads belonged to 95 bacterial genera that were detected in all samples. The number of OTUs found in at least 80 % of the samples greatly outnumbered the number of OTUs present in all samples (i.e. the core community) both in the archaeal and bacterial communities. The number of archaeal and bacterial OTUs present in at least 50 % of the samples was only 800 and less than 600, respectively. Only approximately 10 000 archaeal and 30 000 bacterial OTUs were present in at least 20 % of the samples (i.e. two to three samples). Compared to the total of 468 684 archaeal and 301 458 bacterial OTUs detected in these samples, the proportion of rare OTUs present in only one or two samples is very large. Our results agree with Sogin et al. (2006) and Magnabosco et al. (2014), who showed that a relatively small number of taxa dominate deep-sea water and deep groundwater habitats, respectively, but that a rare microbiome consisting of thousands of taxonomically distinct microbial groups is detected at low abundances. What this means for the functioning of the deep subsurface is that the microbial communities have the capacity to respond to changes in environmental conditions. For example, Pedersen et al. (2014) showed that by adding sulfate to the sulfate-poor but methane-rich groundwater in Olkiluoto, the bacterial population changed over the span of 103 days from a non-sulfate-reducing bacteria (non-SRB) community to a community dominated by SRB. In addition, a change in the geochemical environment induced by H2 and methane impacted the size, composition, and functions of the microbial community and ultimately led to acetate formation (Pedersen, 2012, 2013; Pedersen et al., 2014).
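The prevalence tiers discussed above (OTUs present in all, at least 80 %, at least 50 %, or at least 20 % of the samples) can be illustrated with a minimal sketch. The toy table, the OTU names, and the function names below are hypothetical and serve only to show the arithmetic; they are not the data or the pipeline used in this study.

```python
from typing import Dict, List

# Hypothetical toy OTU table: read counts per OTU across five samples.
# Names and numbers are illustrative only, not data from this study.
TOY_TABLE: Dict[str, List[int]] = {
    "otu_a": [50, 40, 60, 30, 20],  # present in 5/5 samples
    "otu_b": [5, 0, 3, 2, 1],       # present in 4/5 samples
    "otu_c": [0, 0, 7, 0, 9],       # present in 2/5 samples
    "otu_d": [0, 100, 0, 0, 0],     # present in 1/5 samples
}


def otu_prevalence(table: Dict[str, List[int]]) -> Dict[str, float]:
    """Fraction of samples in which each OTU has at least one read."""
    return {
        otu: sum(1 for count in counts if count > 0) / len(counts)
        for otu, counts in table.items()
    }


def otus_at_threshold(table: Dict[str, List[int]], min_fraction: float) -> List[str]:
    """Sorted names of OTUs present in at least `min_fraction` of the samples."""
    prevalence = otu_prevalence(table)
    return sorted(otu for otu, p in prevalence.items() if p >= min_fraction)


def core_read_share(table: Dict[str, List[int]], sample_index: int) -> float:
    """Share of one sample's reads that belong to OTUs present in all samples."""
    core = set(otus_at_threshold(table, 1.0))
    total = sum(counts[sample_index] for counts in table.values())
    core_reads = sum(
        counts[sample_index] for otu, counts in table.items() if otu in core
    )
    return core_reads / total if total else 0.0


if __name__ == "__main__":
    for threshold in (1.0, 0.8, 0.5, 0.2):
        print(threshold, otus_at_threshold(TOY_TABLE, threshold))
    print(core_read_share(TOY_TABLE, sample_index=1))
```

In this toy example a single OTU forms the core, yet its share of the reads differs strongly between samples, which mirrors the distinction drawn above between the number of core OTUs and the proportion of sequence reads they account for.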
The metabolic pathways predicted by PICRUSt are far from certain when uncultured and unculturable deep subsurface microbial communities are concerned. The NSTI values for both the bacterial and the archaeal communities were high, indicating that closely related species to those found in our deep groundwater have yet to be sequenced. This is in accordance with Langille et al. (2013), who showed that environments containing a high degree of unexplored microbiota also tend to have high NSTI values. Staley et al. (2014) also showed in a comparison between PICRUSt and shotgun metagenomic sequencing of riverine microbial communities that PICRUSt may not be able to correctly assess rare biosphere functions. Nevertheless, Langille et al. (2013) showed that PICRUSt may predict the metagenomic content of a microbial community more reliably than shallow metagenomic sequencing. Although PICRUSt does not give as detailed results as metagenomic or genomic analyses may give, it is still a useful tool for predicting functions in microbial communities when metagenomic analysis is not possible, e.g. due to low biomass in the samples.

Energy metabolism
Deep subsurface environments are often called energy-deprived environments dominated by autotrophic microorganisms (Hoehler and Jorgensen, 2013). However, recent reports indicate that heterotrophic microorganisms play a greater role than the autotrophic microorganisms in Fennoscandian deep crystalline subsurface environments (Purkamo et al., 2015). Heterotrophic communities with rich fatty acid assimilation strategies have been reported to fix carbon dioxide alongside e.g. fermenting activities in order to replenish the intracellular carbon pool, which otherwise would be depleted. Wu et al. (2016) also found by metagenomic analyses that fermentation was a major metabolic activity in the microbial community of Swedish deep groundwater. Our results agree with Purkamo et al. (2015) that a greater proportion of the microbial community is involved in carbohydrate and fatty and organic acid oxidation than in fixation of inorganic carbon. Nevertheless, autotrophic carbon fixation pathways were predicted in the analysis with PICRUSt, indicating that both the archaeal and bacterial communities include autotrophic members, although these microorganisms might not be obligate autotrophs. It was also noted that even though evidence of methane oxidation could not be inferred from the PICRUSt predictions (no pmoA genes), the bacterial community may oxidize formate, which is in agreement with the findings reported by Wu et al. (2016).
Several carbon fixation pathways were predicted in the metagenomes: the Calvin cycle, the reductive TCA (rTCA) cycle, and the Wood-Ljungdahl (WL) pathway. The WL pathway is considered the most ancient autotrophic carbon fixation pathway in bacteria and archaea (Fuchs, 1989; Martin et al., 2008; Berg et al., 2010; Hügler and Sievert, 2011) and was found in both the bacterial and archaeal communities. In the archaeal community the Calvin cycle and the rTCA were especially pronounced in the samples from 296 and 405-423 m depth, and somewhat lower at 510-527 m depth. The bacterial communities are predicted to fix CO2 at almost all depths, with the exception of 405 and 559 m depth. Nevertheless, our results agree with Nyyssönen et al. (2014), who showed by metagenomic analysis that the microbial communities at different depths of the Outokumpu scientific deep drill hole may fix carbon in several ways, of which the rTCA, the WL pathway, and the Calvin cycle were identified. Magnabosco et al. (2016) showed that the WL pathway was the dominating form of carbon fixation in metagenomes of 3 km deep Precambrian crust biospheres in South Africa. Dong et al. (2014) also suggested that microorganisms in a low-energy deep subsurface environment may have several strategies for e.g. carbon fixation, as shown for Halomonas sulfidaeris, in order to access as many resources as possible. The predicted methane metabolism (methane and methyl compound consumption) and oxidative phosphorylation were equally strong in the bacterial community. Sulfur metabolism was not predicted to be a common pathway for energy in either the archaeal or the bacterial communities. However, PICRUSt predicted bacteria with either assimilative or dissimilative sulfate reduction to be present. Sulfur oxidation through the sox system was in general not predicted, but the soxD gene was predicted, and oxidation of thiosulfate to sulfate may be possible (Fig. S6). According to the predicted metagenomes, nitrate may be reduced both through dissimilatory nitrate reduction to ammonia and through denitrification to nitrous oxide by the bacteria. In addition, nitrogen may be fixed to ammonia by both archaea and bacteria. The ammonia may be used as raw material for L-glutamate synthesis.
Oxidative phosphorylation was predicted as one of the most prominent energy-generating metabolic pathways in the bacterial community. This indicates that ATP is generated by electron transfer to a terminal electron acceptor, such as oxygen, nitrate, or sulfate. In the archaeal community oxidative phosphorylation was not as strongly indicated, but this may be due to missing data on archaeal metabolism in the KEGG database.
The main energy metabolism of the archaeal communities was predicted to be methanogenesis, especially at 296 and 405 m. Methanogenesis was also common at all other depths except 330-347, 415, and 693-798 m. Methane was predicted to be produced from CO2 and H2, from methanol, and from acetate, although evidence for the acetate kinase enzyme was lacking. Methanogenesis from methylamines may also be possible, especially at 296 and 405 m. Methane oxidation using methane monooxygenases and methanol dehydrogenases was not predicted in either the bacterial or the archaeal communities. It should be noted, however, that the ANME-2D archaea are likely to use the methanogenesis pathway in reverse for oxidizing methane anaerobically to carbon dioxide (Haroon et al., 2013). The produced carbon dioxide may be fixed by the same archaea and turned into acetate, which may serve as a carbon substrate and electron donor and acceptor for a large variety of microorganisms in the groundwater.

Carbohydrate metabolism
Glycolysis/gluconeogenesis was one of the most common carbohydrate-metabolizing pathways predicted for both the archaeal and bacterial communities (Fig. 8). Pyruvate from glycolysis is oxidized to acetyl-CoA by both archaea and bacteria and used in the TCA cycle. The TCA cycle provides, for example, raw material for many amino acids, such as lysine and glutamate. The butanoate and propanoate metabolisms were also common in the bacterial communities, indicating fermentative metabolism and capability of fatty acid oxidation.

Amino acid metabolism
The predicted metagenomes indicated that non-essential amino acids, such as alanine, aspartate, and glutamate, may be produced from ammonia and pyruvate or oxaloacetate, especially in the archaeal populations. In the archaeal population proline appeared to be produced from glutamate. Despite the predicted low use of sulfate as an energy source in the microbial communities, sulfate and other sulfur compounds could be taken up for the production of the amino acids cysteine and methionine by both the archaeal and bacterial communities. A higher predicted relative abundance of genes involved in aromatic amino acid synthesis (phenylalanine, tyrosine, tryptophan) was seen in the archaeal than in the bacterial communities. Both the archaeal and the bacterial communities were predicted to synthesize branched-chain amino acids (isoleucine, leucine, and valine), but only the bacteria were predicted to degrade them. Proteobacteria especially have been shown to be able to use the branched-chain amino acids (isoleucine, leucine, and valine) and short-chain fatty acids (acetate, butyrate, propionate) as the sole energy and carbon source (Kazakov et al., 2009). The branched-chain amino acids function as raw material in the biosynthesis of branched-chain fatty acids, which regulate the membrane fluidity of the bacterial cell. Under salt stress conditions, the proportion of branched-chain fatty acids in the membranes decreases.

Membrane transport
According to the predicted metagenomes, the microbial cells transport sulfate into the cell, but do not take up nitrate. Nitrogen is taken up as glutamate but not as urea. Iron is taken up by an Fe(III) transport system and an iron complex transport system in the bacterial communities, but generally only by the iron complex transport system in archaea. However, an Fe(III) transport system may also exist in the archaeal communities at 405 to 423 m depth, where some manganese/iron transport systems could also be found. According to the metagenome predictions, molybdate and phosphate are transported into the cell by molybdate and phosphate ATPases, respectively. Nickel is taken up mainly by a nickel/peptide transport system but also to some extent by a cobalt/nickel transport system. Zinc is taken up to some extent by a zinc transport system, but transport systems for manganese, manganese/iron, manganese/zinc/iron, or iron/zinc/copper were not predicted. Ammonia was predicted to be taken up by an Amt transport system.

Conclusions
The microbial communities in the deep Fennoscandian groundwater at the Olkiluoto site are highly diverse, yet the majority of the microbial community belongs to only a few microbial taxa, while the greatest part of the microbial diversity is represented by low-abundance and rare microbiome taxa. The core community was present in all tested samples from different depths, but the relative abundance of the different taxa varied between the samples. Nevertheless, the number of OTUs found in only a small proportion (e.g. 20 %) of the samples far surpassed the number of OTUs included in the core communities. Fermentation and oxidation of fatty acids were common carbon cycling and energy-harvesting metabolic pathways in the bacterial communities, whereas the archaea may either produce or consume methane. Glycolysis/gluconeogenesis was predicted to be common in both the archaeal and bacterial communities. In addition, both the bacterial and archaeal communities were predicted to contain several different common carbon fixation pathways, such as the Calvin cycle, the reductive TCA cycle, and the Wood-Ljungdahl pathway.

Figure 1. Map of Olkiluoto. The boreholes used in this study are marked with a turquoise triangle and the attached black line depicts the direction of the borehole (with courtesy of Pöyry Oy, 17 November 2015, by Eemeli Hurmerinta).

Figure 2. The concentration of (a) microbial cells mL⁻¹ determined by epifluorescence microscopy and the estimated concentration of (b) bacterial and (c) archaeal 16S rRNA gene copies mL⁻¹ groundwater determined by qPCR in water-conductive fractures situated at different depths in the Olkiluoto bedrock.

Figure 3. The core (a) archaeal and (b) bacterial community OTUs detected from at least 80 % of the samples, with heatmaps of the abundance of the (c) archaeal and (d) bacterial core community profiles. In (a) and (b) the OTUs are stacked in the columns according to the number of sequence reads, with the most abundant OTUs at the bottom of the columns. The OTU segments of the columns are coloured according to the family to which they belong. Each order is presented as a separate column for each sample.

Figure 4. The number of shared (a) archaeal and (b) bacterial OTUs in the 12 different samples. The number of shared OTUs is shown on the y axis and the proportion of samples on the x axis.

Figure 5. A UPGMA cladogram clustering the samples based on the 1000 largest OTUs of the (a) archaeal and (b) bacterial OTU profile according to the Bray-Curtis distance model. Black indicates low abundance and red indicates high abundance. (c) and (d) show the corresponding physicochemical parameters as shown in Table 1, with the lowest values in green, medium values in yellow, and high values in red.

Figure 6. Principal coordinate analysis (PCoA) on all the OTU profiles of the different samples based on the Bray-Curtis distance model for the (a) archaeal and (b) bacterial communities. The points indicate water type, where the circle is for brackish sulfate-rich water, the square is for brackish chloride-rich water, and the diamond is for saline water. The colouring of the points is according to the concentration of total dissolved solids (TDS) as indicated in the upper left corner of (a).
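The Bray-Curtis distance underlying Figs. 5 and 6 can be written for two samples u and v as the sum of |u_i - v_i| over all OTUs divided by the sum of (u_i + v_i). The following is a minimal sketch of that standard formula, assuming raw read counts per OTU as input; the function names and the toy profiles are hypothetical, and any normalization or subsampling applied before the published analysis is not reproduced here.

```python
from typing import List, Sequence


def bray_curtis(u: Sequence[float], v: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two OTU count profiles.

    Returns 0.0 for identical profiles and 1.0 for profiles sharing no OTUs.
    """
    if len(u) != len(v):
        raise ValueError("profiles must cover the same OTUs in the same order")
    denominator = sum(u) + sum(v)
    if denominator == 0:
        # Two empty profiles: treated as identical by convention in this sketch.
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


def distance_matrix(profiles: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise Bray-Curtis matrix, the usual input to UPGMA clustering or PCoA."""
    return [[bray_curtis(p, q) for q in profiles] for p in profiles]


if __name__ == "__main__":
    # Hypothetical toy profiles: three samples, three OTUs each.
    samples = [[10, 0, 5], [10, 0, 5], [0, 15, 0]]
    for row in distance_matrix(samples):
        print(row)
```

The resulting symmetric matrix is what a hierarchical clustering (such as UPGMA) or an ordination (such as PCoA) would take as input.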

Figure 7. The relative abundance of predicted genes of the most abundant pathways identified in the (a) archaeal and (b) bacterial populations in the PICRUSt analysis. The pathways are presented according to KEGG. The samples are ordered according to depth, with OL-KR13/296 m as the innermost sample and OL-KR29/798 m as the outermost sample.

Figure 8. The microbial carbon metabolism pathway according to KEGG. The predicted genes combined from all samples were plotted on the map. Green arrows indicate enzymes predicted only in the archaeal communities, red arrows indicate genes predicted only in the bacterial communities, and black arrows show enzymes predicted in both the archaeal and bacterial communities.

Table 3. The nearest sequenced taxon index (NSTI) values for the archaeal and bacterial communities in the 12 different samples according to PICRUSt. The NSTI value describes the sum of phylogenetic distances of each OTU to its nearest relative with a sequenced reference genome, and measures substitutions per site in the 16S rRNA gene and the weighted frequency of each OTU in a sample data set. A higher NSTI value indicates greater distance to the closest sequenced relatives of the OTUs in each sample.
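One way to read the NSTI description above is as an abundance-weighted mean of each OTU's phylogenetic distance to its nearest sequenced relative. The sketch below illustrates that reading only; the function name, the OTU names, and the toy values are hypothetical, and the code is not PICRUSt's own implementation.

```python
from typing import Dict


def weighted_nsti(
    distance_to_nearest_genome: Dict[str, float],
    read_counts: Dict[str, int],
) -> float:
    """Abundance-weighted mean distance (substitutions per site) for one sample.

    `distance_to_nearest_genome` maps each OTU to its phylogenetic distance
    from the closest reference with a sequenced genome; `read_counts` maps
    each OTU to its number of reads in the sample.
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("sample contains no reads")
    weighted_sum = sum(
        distance_to_nearest_genome[otu] * count
        for otu, count in read_counts.items()
    )
    return weighted_sum / total_reads


if __name__ == "__main__":
    # Hypothetical toy values, not taken from Table 3.
    distances = {"otu_a": 0.02, "otu_b": 0.30, "otu_c": 0.10}
    counts = {"otu_a": 80, "otu_b": 15, "otu_c": 5}
    print(weighted_nsti(distances, counts))
```

Under this reading, a sample dominated by OTUs with distant reference genomes yields a high value even when a few well-characterized OTUs are present, which is consistent with the interpretation that a higher NSTI indicates greater distance to sequenced relatives.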