Survey on the chemical composition of several tropical wood species

Variability in the chemical composition of 614 species is described in a database containing measurements of wood polymers (cellulose, lignin and pentosan), as well as overall extraneous components (ethanol-benzene extracts, hot water extracts and ash, with a focus on silica content). These measurements were taken between 1945 and 1990 using the same standard protocol. In all, 1,194 trees belonging to 614 species, 358 genera and 89 families were measured. At species level, variability (quantified by the coefficient of variation) was rather high for density (27%), much lower for lignin and cellulose (14% and 10%) and much higher for ethanol/benzene extractives, hot water extractives and ash content (81%, 60% and 76%). Considering trees with at least five different specimens, and species with at least 10 different trees, it was possible to investigate within-tree and within-species variability. Large differences were found between trees of a given species for extraneous components, so more than one tree per species is needed. For density, lignin, pentosan and cellulose, the distribution of values was nearly symmetrical, with mean values of 720 kg/m³ for density, 29.1% for lignin, 15.8% for pentosan, and 42.4% for cellulose. There were clear differences between species for lignin content. For extraneous components, the distribution was highly asymmetrical, with a minority of woods rich in a given component making up the high-value tail. A high value for any extraneous component, even in only one tree, is sufficient to classify the species with respect to that component. Siliceous woods identified by silica bodies in anatomy have a very high silica content and only those species deserve a silica study.

Introduction
The use of wood for pulp and paper making was the result of German inventions in the second part of the 19th century, and the wood-based pulp industry grew quickly at the beginning of the 20th century (Ek et al., 2009). The Technical Association of the Pulp and Paper Industry (TAPPI¹) was founded in 1915 and there was demand for more knowledge about wood chemistry. The Chemistry of Wood (Hawley and Wise, 1926) was the first reference book on the subject, with an extensive discussion of methods for analysing carbohydrates (cellulose, hemicelluloses, pectin), lignin, extractives and ash. Methods were discussed at TAPPI meetings, leading to official standards for extractives in 1933 (T 204), ash in 1934 (T 211), pentosan in 1948 (T 223) and lignin in 1954 (T 222).
In France, there was a request in the 1940s to carry out a feasibility analysis for pulp and paper making using forest resources from tropical countries in the French colonies and overseas departments (Le Cacheux, 1949). The very wide diversity of species in each forest plot was a challenge. From the outset, two alternatives were examined: single-species plantations, or mixing a rather large number of dominant species (fewer than 20) from a large natural forest zone (Quint, 1951). Pulp and paper tests were performed at laboratory and industrial levels in the two cases (Pétroff, 1965, 1976). Although pulp can be obtained using a large diversity of species, technical problems were substantial and the economic outcome was doubtful (Pétroff, 1960, 1976; Tissot, 1989).
Knowledge of the chemical composition of the selected species was needed and that objective was launched by a State PhD thesis (Morize, 1953) dedicated to setting up, within the CTFT (French Technical Centre for Tropical Forests) cellulose division, the best protocols for tropical woods using the state of the art discussed at TAPPI meetings. Methods were precisely described in the thesis, which was reproduced in its entirety in a reference book (Savard et al., 1954) and further discussed in a second book (Savard et al., 1959). They were very similar to those from the US Forest products laboratory (Pettersen, 1984) and were constantly used in the laboratory up to 1990.
We believe that there was no variation within the standard protocol for the following reasons: all the measurements were performed in the same laboratory, with the same equipment, by technicians well trained for "quality work" although it was not yet an ISO standard (two of the authors were colleagues of these persons from 1973 to 1994). Two thirds of the measurements were carried out between 1946 and 1961 (figure 1). Between 1964 and 1990, the effort was much more limited, mainly linked to the use of tropical wood for energy purposes (Doat, 1977; Pétroff and Doat, 1978) and to new interest in the pulp and paper industry in French Guiana (Tissot, 1989).
¹ http://www.tappi.org/
In order to check for possible changes, we looked at variations in the results over successive years. There were no significant differences between periods, except for species sampling, which was much broader in the first period. We have decided to open these data files to the scientific community because we trust their quality, in terms of continuity.
Mostly hardwood species (599) were measured, including a very small number of temperate hardwood species (5), with only 15 gymnosperm species being included. Seven basic parameters are discussed: hot water extracts, ethanol-benzene extracts, ashes, silica content within ashes, Klason lignin, pentosan, Kürschner cellulose.

Methods
The analytical methods were extensively described and discussed in the two reference books (Savard et al., 1954, 1959), and they were very similar to those described in Pettersen (1984). We will just give a summary of the methods used to measure the presented parameters.

Sample production
All specimens came from trees with a reference in the CIRAD collection (Langbour et al., 2019). The basic specimen was a clear-wood rod measuring 34 cm (L direction) x 2 cm (R direction) x 2 cm (T direction) delivered by the carpenter's workshop (same specimen as for flexure testing).
Each rod was chipped using a standardized procedure. The chips (50 g of air-dried wood) were milled into wood powder by centrifugal separation (12,000 rpm) with breaks every 30 minutes in order to avoid excessive heating. A 40-mesh sieve (40 apertures per linear inch, i.e. a pitch of about 0.64 mm) followed by an 80-mesh sieve (80 apertures per linear inch, i.e. a pitch of about 0.32 mm) was used to obtain a calibrated 40/80 powder for chemical analysis.
[Figure: number of tests per year, by calendar year (1940 to 1990).]

First step, ethanol/benzene extract (AB ext)
AB is the acronym for a 1/1 volume-based ethanol (95°) / benzene (pure) mixture used to extract most organic materials that are insoluble in water. Twenty to 25 g of 40/80 powder was placed in a Soxhlet extraction apparatus with the mixture for a total of 8 hours' extraction, before passing through ether solvent for 6 hours (10' each time). Relative mass loss (as compared to anhydrous wood gross weight) after total drying (105 °C) was used for the AB extract value.
AB extraction was always carried out before all the other analyses, except for silica content.

Water extract (W ext)
Hot water (at boiling temperature) can extract mineral salts, tannins, starch, gums and some sugars. The protocol was: 1.5 g of powder (after AB extraction) in 100 mL of hot water for 8 hours. Relative mass loss (as compared to anhydrous wood gross weight before AB extraction) after total drying (105 °C) was used for the water extract value.
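The conversion implied by this protocol can be sketched as follows. This is only an illustration under stated assumptions: the function name and arguments are hypothetical, and the rescaling by (100 - AB ext) is our reading of "as compared to anhydrous wood gross weight before AB extraction", not a formula given in the reference books.

```python
def water_extract_percent(dry_mass_before_g, dry_mass_after_g, ab_ext_percent):
    """Hot water extract as a % of anhydrous wood mass before AB extraction.

    dry_mass_before_g: oven-dry mass of AB-extracted powder (about 1.5 g).
    dry_mass_after_g:  oven-dry mass of the same powder after hot water extraction.
    ab_ext_percent:    AB extract of the wood, in % of gross anhydrous mass.
    All names are hypothetical; the rescaling step is an assumption.
    """
    if dry_mass_before_g <= 0:
        raise ValueError("dry_mass_before_g must be positive")
    # Loss relative to the AB-extracted powder actually placed in the water.
    loss_on_extracted = (dry_mass_before_g - dry_mass_after_g) / dry_mass_before_g
    # 1 g of gross wood leaves (100 - AB ext)/100 g of AB-extracted powder,
    # so the loss is rescaled to the gross anhydrous mass.
    return loss_on_extracted * (100.0 - ab_ext_percent)


# Illustrative values only (not taken from the database).
print(round(water_extract_percent(1.50, 1.44, 5.0), 2))
```

With these illustrative masses, a 4% loss on the AB-extracted powder corresponds to a water extract of 3.8% of the gross anhydrous wood.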

Ash content (Ash)
Total ash content was obtained by incinerating 1.5 g of powder in an electric oven at 425 ± 10 °C until constant mass was reached.

Silica content (Sil)
Silica often only amounts to a few percent of ash content, which in turn accounts for less than 1% of dry wood mass. In order to use 1 g of ash, it takes a rather large amount of initial wood mass. Small wood sticks (larger than a big match) were calcined at 425 °C to obtain at least 1 g of ash for silica analysis. Its purity was confirmed by hydrofluoric acid action (Besson, 1946).

Lignin (Lig)
Lignin content was measured using the Klason method (Pettersen, 1984). Sulphuric acid at a concentration of 67% was used in a ratio of 30 mL of acid for 1.5 g of powder after AB extraction. The value reported in the table of results is the mass ratio of dried lignin to wood gross dry weight.

Pentosans (Pent)
For hardwoods, five carbon sugars (pentosans) are the major hemicelluloses (often above 80%), and they were measured using furfural analysis. A 100 mL volume of hydrochloric acid at 13.2% was used for 0.5 to 0.7 g of AB extracted powder.

Cellulose (Cell)
The Kürschner and Hoffer method was used. A water bath regulated at water boiling temperature was used with a 50 mL solution of 1 volume of nitric acid at 48° Baumé + 4 volumes of 95° alcohol for 1.5 g of AB extracted powder, for 1 hour. After washing with alcohol and filtration, the extraction process was repeated three more times (total of four extraction processes), after which the extract was dried at 105 °C after washing and rinsing in alcohol and ether.

Balance (Total)
Summation of the AB extract + Water extract + Ash + Lignin + Pentosan + Cellulose parameters should be near the 100% value (a total slightly above 100% can result from uncertainties accumulated throughout the processes). The main reason for rather large shortfalls was that six-carbon sugar hemicellulose constituents were not measured; this mainly affects gymnosperms, where they generally account for around half the hemicellulose content. However, some analyses of mannan and galactan for tropical hardwoods (Savard et al., 1954, 1959; Pettersen, 1984) have shown that these six-carbon sugars in hemicelluloses can exceed 5% of hardwood gross dry mass. This means that the tables do not give a value for the total hemicellulose content, but the difference between the balance and 100 is mostly representative of the share of hemicelluloses that are not taken into account in pentosan.
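As a minimal sketch of this bookkeeping, the balance and the share left unaccounted for can be computed as below. The function names are hypothetical, and the example mixes the mean polymer values and median extraneous values quoted elsewhere in this paper purely for illustration; it does not correspond to any single tree.

```python
def balance(ab_ext, w_ext, ash, lignin, pentosan, cellulose):
    """Sum of the six measured fractions, all in % of gross anhydrous wood mass."""
    return ab_ext + w_ext + ash + lignin + pentosan + cellulose


def unaccounted_share(total):
    """Difference between 100 and the balance (mostly non-pentosan hemicelluloses)."""
    return 100.0 - total


# Illustrative values only: median extraneous contents and mean polymer contents.
total = balance(ab_ext=2.1, w_ext=2.4, ash=0.7,
                lignin=29.1, pentosan=15.8, cellulose=42.4)
print(round(total, 1), round(unaccounted_share(total), 1))
```

A negative unaccounted share simply corresponds to a balance above 100%, which the text attributes to measurement uncertainty.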

Database and statistical methods
All the informative data and metadata on the collection have been recorded in digital files since 1980 (Gérard and Narboni, 1996). In the data file associated with this paper, the botanical names have been updated and mean density values have been added at species level (unless there was a density value associated with the CTFT id in the wood collection).
Basic statistical analyses were performed using XLSTAT software. The data description table includes the number of present and missing data, minimum, maximum, 1st quartile, median, 3rd quartile and mean (with its standard deviation) values for each parameter, as well as the coefficient of variation (CV), skew (Pearson) and kurtosis (Pearson) of the distribution. A box plot is also given for each parameter. The box shows the quartiles (the band inside the box is the median). The whiskers extend to the lowest data item still within 1.5 IQR (interquartile range) of the lower quartile, and to the highest data item still within 1.5 IQR of the upper quartile.
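The whisker rule described above can be sketched with the Python standard library. This is not the XLSTAT implementation: the function name is hypothetical, and the quartile convention used here (the "inclusive" method of `statistics.quantiles`) is an assumption, since quartile definitions differ between packages.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Quartiles and 1.5 IQR whiskers for a list of numbers (hypothetical helper)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("at least two values are needed")
    # Assumed quartile convention; XLSTAT may use a different one.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers stop at the most extreme data items still inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Illustrative data with one high value, as is typical of extraneous components.
print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30]))
```

For a long-tailed parameter such as AB extract, the values beyond the upper whisker are the "rich" woods discussed later in the text.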
For the histogram presentation, the amplitude was chosen for each parameter, in order to have a clear description of the data. Normality of distribution was verified by four tests: Shapiro-Wilk, Anderson-Darling, Lilliefors and Jarque-Bera. In the case of normal distribution, a Pearson type correlation analysis was used, and a Spearman type for non-normal distribution.

Description of the database
There are five data sheets and one comment sheet.

Test sheet
This gathers the results for 1,287 complete tests (all parameters present, except for some missing silica contents), in CTFT reference numerical order with 15 columns: Test (range of the set of measurements), Year, Tree, Origin, Family, Species, Density (density values available in the wood collection), AB ext, W ext, Ash, Silica, Lignin, Cellulose, Pentosan, Balance. There are 32 trees with more than one set of measurements (from 2 to 17).

Tree sheet
This gathers the results for 1,194 trees. It was built from the Test sheet, by calculating the mean parameter values for each tree.

Species sheet
This gathers the results for 614 species. It was built from the Tree sheet, by calculating mean parameter values for each species. Missing density values (not measured in the wood collection) for the species were collected from existing Internet literature (some values are still missing).

Genus sheet
This gathers the results for 358 genera. It was built from the Species sheet, by calculating mean parameter values for each genus.

Family sheet
This gathers the results for 89 families. It was built from the Genus sheet, by calculating mean parameter values for each family. An additional line gives the mean value for gymnosperms, represented by only 4 families (8 genera and 15 species).
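The way each sheet is derived from the previous one (tests to trees, trees to species, and so on) can be sketched as a repeated "mean of means". The helper below is hypothetical and uses plain dictionaries; the column names follow the Test sheet, but the rows are invented for illustration.

```python
from collections import defaultdict
from statistics import fmean


def mean_by(rows, key, fields, carry=()):
    """Group rows (dicts) by `key` and average `fields` within each group.

    `carry` lists columns copied unchanged from the first row of each group
    (for example the species of a tree). Missing values (None) are skipped.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    result = []
    for name, members in groups.items():
        summary = {key: name}
        for column in carry:
            summary[column] = members[0][column]
        for field in fields:
            present = [m[field] for m in members if m.get(field) is not None]
            summary[field] = fmean(present) if present else None
        result.append(summary)
    return result


# Invented rows: two tests on tree A and one test on tree B, same species.
tests = [
    {"Tree": "A", "Species": "X", "Lignin": 28.0},
    {"Tree": "A", "Species": "X", "Lignin": 30.0},
    {"Tree": "B", "Species": "X", "Lignin": 32.0},
]
tree_sheet = mean_by(tests, "Tree", ["Lignin"], carry=["Species"])
species_sheet = mean_by(tree_sheet, "Species", ["Lignin"])
print(species_sheet)
```

Note the consequence of this design: each tree weighs the same in the species mean regardless of how many tests it received, so the species value here is 30.5 rather than the pooled mean of 30.0.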

Global distribution of values
The distribution of values for all measurements (1,287) in the test sheet is presented in figure 1 and figure 2. For all extraneous components (figure 2), the distribution was highly dissymmetrical, with a rather small number of values much higher than the great majority. Total values for these extraneous components were often above 15% (sometimes above 25%). This could introduce bias in the composition of the main cell wall components.
For density, balance, cellulose, lignin and pentosan content (figure 3), the distribution was nearly symmetrical, with a very wide range in density, representative of tropical species in the collection.

Within-tree level
There were very few cases of measurements on different specimens within the tree (always in heartwood). Using those with at least five specimens in one tree (10 trees belonging to 8 species), it was possible to take a look at the within-tree variability of chemistry (figure 1, table I).  The coefficient of variation (CV) was always small (around 5%) for the constitutive polymers (table I) and there were clear differences between trees, even within the same species (4 trees for Terminalia superba).
The CV was rather high (around 25%) for extraneous components, but differences between mean tree values were also rather high, so there were clear differences between trees, even within the same species (figure 1).
For silica content, there was a contrast between so called "siliceous species" of the Dacryodes genus, with values always above 1,500 ppm, and the other species, with values always below 200 ppm and always below 50 ppm for half of the trees. For these "common" trees, the CV was very high (between 50% and 150%) and it was pointless seeking distinctions between trees for this parameter. For the two siliceous trees (17722, 17878), the CV was much lower (around 15%) and it seemed possible, in this small sample, to see clear differences between the two trees.
Overall, for most of the chemical parameters, it was possible to have distinctions between trees (even within the same species), either because of a small CV (main components), or due to large mean per tree differences with a rather high CV (extraneous components).
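For readers who wish to reproduce the within-tree comparison, the coefficient of variation can be computed as sketched below. The function name is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, as the convention used in the original tables is not stated here.

```python
from statistics import fmean, stdev


def coefficient_of_variation(values):
    """CV in %: sample standard deviation divided by the mean (assumed convention)."""
    if len(values) < 2:
        raise ValueError("at least two values are needed")
    mean = fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * stdev(values) / mean


# Invented lignin contents (%) for five specimens of one tree.
print(round(coefficient_of_variation([28.0, 29.0, 30.0, 31.0, 32.0]), 2))
```

The invented values give a CV of about 5%, the order of magnitude reported above for the constitutive polymers.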

Within-species level
There were some cases of measurements on different trees within a species. Using those with at least 10 trees per species (12 species belonging to 11 genera), it was possible to take a look at the within-species variability of chemistry (figure 2, table II).
The coefficient of variation (CV) was rather small (around 10%) for density, but the differences between mean values per species were large. Density was a rather good criterion for species separation. The CV was high (around 40%) for extraneous components and differences between mean species values were high. There were clearly species with low values (lower than 4% for extractives and lower than 1.5% for ashes) for one or other extraneous component. They never seemed to have high values (figure 2). Other species often had very high values for AB extract (above 10%), but the range of values was large and individual trees may have had a low value for this parameter. Although the differences seemed evident with 10 trees per species, using only one tree could lead to a wrong conclusion: a low value was not proof that a mean value would be low, but a high value seemed to be a good indicator of a high mean value. This was also true for water extract and ashes.
For silica content, there was a clear separation between siliceous species (many values over 1,000 ppm) and the common ones (no values over 100 ppm). However, inside the siliceous species, there were large differences between trees and it would be hazardous to make a classification within siliceous species based on only one tree.
For the constitutive polymers, the CV values were still rather weak and differences in mean values per species rather low (table II). In any event, there was a clear distinction between species for lignin content (figure 2).
Overall, for most of the chemical parameters, it was possible to find distinctions between species using 10 trees per species, but using only one tree could make some sense for density, or main component contents (allowing the separation of high, average, or low values species).
For extraneous components, there could be mistakes for low values when using only one tree, because of the very large variation in values between different trees of species with a high mean content.
In the case of silica, the best way was to look first at the anatomy data for the occurrence of specific criteria (Wheeler et al., 1989). If no silica was indicated, the silica content would stay at very low values (usually below 100 ppm). For siliceous species, silica content would be 10 to 100 times greater and deserved to be measured.

Between-species level
Most of the 614 species (67%) were represented by only one tree and 7% had at least five trees. There were only 15 gymnosperm species and very few temperate hardwood species (5).
For the hardwoods, a wide range of densities was covered (table III) and the CV (27%) was very similar to the CV of the wood collection (28%). When compared to this density CV, the CVs for lignin and cellulose content were much lower (14% and 10%, respectively). Extraneous components, such as ethanol/benzene extractives, hot water extractives or ash, had a much higher CV (81%, 60% or 76%). For silica, only 60% of the species were measured and there were huge differences (from less than 100 ppm to more than 10,000 ppm).
For tropical woods, the total content of these extraneous components (figure 1) could reach 20 to 25%. In this case, lignin and cellulose content based on gross wood mass were much smaller than usual and did not give an appropriate representation of cell wall material for most properties.
It was decided to add "relative values" for lignin (Lig rel), pentosan (Pen rel) and cellulose (Cel rel) contents (as a %), which were the measured values divided by [100 − AB ext − W ext − Ash] and multiplied by 100, in order to obtain more comparable main polymer values.
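The relative value calculation can be written as a short function. The name and the example figures are hypothetical; the only element taken from the text is the denominator, 100 minus the three extraneous contents.

```python
def relative_value(measured, ab_ext, w_ext, ash):
    """Polymer content as a % of extractive-free, ash-free wood.

    All arguments are in % of gross anhydrous wood mass.
    """
    denominator = 100.0 - ab_ext - w_ext - ash
    if denominator <= 0:
        raise ValueError("extraneous components cannot reach 100%")
    return 100.0 * measured / denominator


# Invented extract-rich wood: 20% extraneous components in total.
print(relative_value(measured=25.0, ab_ext=12.0, w_ext=6.0, ash=2.0))
```

In this invented case, a lignin content of 25% on gross mass becomes 31.25% once the 20% of extraneous components are set aside, which illustrates why gross values understate cell wall composition in extract-rich woods.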
Histograms of mean values at species level (figures 4, 5 and 6) showed two major tendencies: i) a more or less symmetrical distribution, not far from normal, but non-normal all the same, for density and the three main cell wall polymers (lignin, pentosan and cellulose), and ii) highly asymmetrical distributions, with long tails towards higher values, for extraneous components. For each of these extraneous components, there seemed to be a minority of "extract-rich", "ash-rich" or "silica-rich" woods making up the high-value tails. For the majority of "common woods", estimated medians (a kind of mean value for common wood) could be: 2.1% for AB extract, 2.4% for water extract, 0.7% for ashes and 70 ppm for silica content.
The correlation between parameters was calculated for the mean values at species level (table IV). Relative values are used for the cell wall polymers. Due to the non-normal distribution of values, the Spearman test was used.
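A rank correlation of this kind can be sketched without any statistical package. The code below is not the XLSTAT routine: the function names are hypothetical, ties receive their average rank (a common convention, assumed here), and no significance test is included.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient (undefined if x or y is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Invented species means: a monotonic but non-linear relationship.
density = [400, 550, 700, 850, 1000]
ab_ext = [1.0, 1.5, 3.0, 8.0, 20.0]
print(spearman(density, ab_ext))
```

Because only the ranks are used, a long high-value tail such as that of the extraneous components does not dominate the coefficient, which is the reason for preferring this approach with non-normal data.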
There were very significant negative correlations between the three cell wall polymer values. This was not surprising, as increasing the proportion of one of them mechanically decreases the proportion of the others. A positive correlation of wood density with lignin, but negative with cellulose, could be explained by the small difference in density between these components (the density of cellulose is lower than that of lignin). Positive correlations between density and extractives (mostly AB extract) could be explained by the additional weight provided by heartwood formation.
The highly significant correlation between extraneous components was difficult to interpret without more information on molecules or minerals and their putative role in the tree (wood protection). There may be some synergy between AB and water extracts (such as hydrolysable tannins), choices between mineral (silica for example) and organic molecules, or simply the solubility of some minerals in hot water.
The highly significant correlation between both extractive contents and the polysaccharide polymers, which was negative with cellulose and positive with pentosan, was puzzling (there was no correlation with lignin). The number of species is proportional to density (x300 for AB ext and W ext, and x120 for Ash).

Discussion
The species measured represent around 10% of the species present in the CIRAD wood collection. In some cases, the choice was guided by specific forests where there was a pulp and paper manufacturing project (Le Cacheux, 1949; Pétroff, 1976; Tissot, 1989). In other cases, it was guided by the need to have a large range of densities (Doat, 1977). Overall, the need to cover a wide range of wood chemical composition was taken into account (Pétroff, 1960). The only way to anticipate a wide diversity of composition was to choose the widest diversity in density known from the wood collection (Langbour et al., 2018) and to use trees from different countries in the tropical region.
In the data file described here, only the country of provenance is recorded, with no further detail on the origin of the trees. By looking at trees with many tests, or species with many trees, we can only observe variability without being able to identify its source. It is quite obvious that genetics, environment and history are drivers of this within-tree and within-species variability. The answer, well known in forest sciences, is that a minimum number of samples per tree and a minimum number of trees per species are needed. Even on a modest estimate, this means between 10 and 100 times more measurements for these 600 species.
Anyway, the data file shows a very wide range of densities, of geographic provenances, of species, genera and families and looking at 10% of the whole set of species is an acceptable option. Given the rather high total number of tests in these conditions, it is highly likely that the whole range of chemical components contents was more or less covered. But from this data file, apart from a very small number of species, it is impossible to have a view of diversity within the species.
Studying the chemical composition of solid wood for all available species appeared as a challenge between 1930 and 1970, driven by the pulp and paper industry. Pettersen (1984) published a reference paper with numerous chemical composition tables (9 tables of different provenances, both tropical and temperate), at species level, derived either from the literature or unpublished from his laboratory (FPL Madison, USA). In all, 600 species were referenced, including around 100 softwoods and 500 hardwoods (50% temperate, 50% tropical). The CTFT results were not included in the paper. We did not find any available digital data, and it was interesting to enter into a computerized data file the values obtained at the FPL laboratory, always using the same protocol, very similar to that described by Savard et al. (1954), for 51 temperate hardwoods and 36 softwoods (there were no silica content measurements). For hardwoods, 60% of the species were represented by only one tree (40% for softwoods). Only 2% of the hardwoods were represented by at least five trees (30% for softwoods). Table V and figure 3 give descriptive values for tropical hardwoods (594 species from this paper), temperate hardwoods (51 species from FPL in the Pettersen paper) and softwoods (15 species from this paper and 36 species from FPL in the Pettersen paper). The main difference between hardwoods and softwood is the pentosan content. Most hardwood hemicelluloses are composed of 5-carbon sugars, while softwood hemicelluloses have both 5-carbon and 6-carbon sugars (this is why the balance is much lower for softwoods). The mean lignin content was higher for softwoods (29%) than for temperate hardwoods (23%), but it was very similar to the mean value for tropical hardwoods. As for density, the range of values was much larger for tropical than for temperate hardwoods. 
It was rather common to separate softwoods, temperate and tropical hardwoods (Kollmann and Côté, 1968; Fengel and Wegener, 1984), but it seems very likely that the main difference between temperate and tropical hardwoods was the large difference in diversity. Temperate hardwoods looked like a sub-sample of total hardwoods with a lower range of composition and densities.
The level of chemical description that was chosen by wood laboratories at that period was rather coarse. A more detailed description (Stevanovic and Perrin, 2009) went more or less deeply into i) the proportion of the main three monomers of lignin (Olsson and Salmen, 1997), ii) the proportion of the complex mixture of hemicelluloses (Fagerstedt et al., 2014) derived from different basic sugars (6-carbon sugars, such as mannose or galactose, or 5-carbon sugars, such as xylose or arabinose), iii) the chemical signature of the specific cocktail of extractives (around one hundred per species) consisting of very different organic bio-active molecules, iv) the quantitative distribution of minerals within ash, where calcium, silica or aluminium can be very substantial, sometimes around 1% for hyper accumulators rather than several hundred ppm for most species (Jansen et al., 2002; Gourlay and Grime, 1994; Kukachka and Miller, 1980). This is possible today thanks to new powerful tools (HPLC, GC/MS, LC/MS, etc.), but it is still rather lengthy, expensive and laborious if applied to thousands of species.
Using such detailed analysis is necessary for a description of wood diversity by chemotaxonomy (Pettersen, 1984; Royer et al., 2010), mostly based on extractives and mineral analyses. It is also useful to study the links between solid wood chemical composition and properties such as shrinkage, vibratory damping (Brémaud et al., 2011, 2013), resistance to insects and decay microorganisms (Neya et al., 2004; Amusant et al., 2014).
The question is: "Do we need to know the chemical composition of solid wood and at what level: wood in general, mean wood for a given forest biome, peculiar chemistry composition for a species or a group of species?" The answer is surely different for industrial purposes and for scientific investigations about relationships between chemical composition and wood properties. The fact remains that detailed or summary information on chemical composition is available today for less than 10% of tree species and this is clearly a serious handicap in making better use of forest biodiversity.

Conclusion
The CTFT chemical composition database for tropical woods provides a relevant view of wood variability at the level of both main and extraneous components in tropical hardwoods, given the wide range of density of species (240 to 1,310 kg/m³), the large number of species (614), genera (358), and families (89) representing respectively 7%, 17% and 39% of species, genera or families present in the CIRAD wood collection.
Values for main components (lignin, pentosan and cellulose) or extraneous components (AB and hot water extracts, ashes) are similar to literature data showing that temperate hardwoods values are included in the larger tropical hardwood range, in the same way as densities. Softwood species clearly differ from hardwoods for pentosan sugars that are less dominant among hemicelluloses in the softwoods.
Variability measured by the coefficient of variation (CV in %) is lower for main components like cellulose (10%), lignin (14%) or pentosan (18%) than for density (27%), but much higher for extraneous components like water extracts (60%), ethanol/benzene extract (81%) or ash content (76%). This means, on a statistical basis, that density should be investigated first before looking at main components influence, but the level of extraneous components should be investigated before density in each case when they are supposed to be active.
Dependency between variations of the parameters, measured by the square of the coefficient of correlation (R² in %), is not strong between density and chemical component content (less than 5%) and within extraneous components (less than 5%). It is not so strong between main and extraneous components (less than 8%) and higher within main components (less than 25%). At the same time, the range of values (minimum to maximum) between tropical hardwood species is rather large for lignin content (17 to 41%), pentosan content (6 to 31%) or cellulose content (36 to 61%), using relative values. It is huge for all extraneous components. It is possible to select, within the species of this database, species with a high or a low value for density or one of the 6 chemical parameters (128 species), together with a small group of "standard" species with values around the median for all characters, in order to investigate the influence of basic chemistry on basic wood properties (shrinkage, mechanical behavior, resistance to fungi…).
Due to the very large variability of extraneous components content, it is impossible to categorize a species with only one tree. There are 235 species with at least 10 trees, each one having at least two duplicate samples in the CIRAD wood collection. It should be possible, using one of these duplicates to have a good chemical description of more than 200 tropical species.
An interesting prospect for further investigation would be to use the duplicate specimens of the CIRAD wood collection (3,460 species representing 10,800 trees have at least two duplicate specimens) for complementary chemical measurements: more species and more detailed investigation within monomers of lignin, hemicelluloses, molecules extracted by solvents or main mineral aggregates.
Apart from providing tools to investigate the role of chemical composition in wood properties, a systematic chemical study of wood extractives is a way to discover a great quantity of active molecules for different sectors such as pharmacology, the food industry, material improvement or cosmetics.