Non-human primates (NHPs) are known to be important reservoirs of diseases that can be pathogenic to humans, and vice versa (Daszak et al., 2000; Leendertz et al., 2006; Liu et al., 2014; Wallis and Lee, 1999; Walsh et al., 2003). One such primate species is Macaca fascicularis, known to carry various diseases, including Plasmodium knowlesi – the fifth malaria (Imai et al., 2014). Currently, P. knowlesi is not fully adapted for human-to-human transmission through an anthropophilic vector, and so traditional malaria control methods have significantly reduced effectiveness in controlling transmission (Imai et al., 2014; Shearer et al., 2016; Viana et al., 2014; Vythilingam et al., 2006). P. knowlesi can be highly pathogenic in humans however, with symptoms akin to Plasmodium falciparum in the severity of infection (Cox-Singh, 2012).
The P. knowlesi malaria parasite currently circulates in non-human populations only, and is thought to be transmitted by primarily exophagic, forest dwelling mosquito species in the Anopheles leucosphyrus group (Barber et al., 2012; Fornace et al., 2016; Imai et al., 2014; Vythilingam, 2010). Zoonotic spillover events result in naturally acquired human infections, and while it is unlikely that humans are a dead-end host, fully-fledged human-to-human transmission does not seem to occur (Barber et al., 2012). “Stuttering chains” are more likely to represent current human-to-human transmission patterns (Viana et al., 2014, p. 270). With increasing incidence of P. knowlesi infections in human populations, the risk of mutation to allow human-to-human transmission via an anthropophilic vector increases (Imai et al., 2014).
Macaca fascicularis is the primary reservoir host for P. knowlesi on Palawan (Imai et al., 2014). 75% of human diseases have a zoonotic origin (Patz et al., 2004), including Plasmodium vivax and P. falciparum (Cox-Singh, 2012; Liu et al., 2014). Cox-Singh (2012) emphasises the importance of surveillance and monitoring to identify host-switch events, and with the uncertainty surrounding the effectiveness of human-to-human transmission, the question of whether spillover events are frequent or infrequent becomes far more important. Understanding the behaviour and ranging patterns of the macaque reservoir is therefore crucial to understanding the potential for transmission of P. knowlesi and other infections to humans.
Both the macaque reservoir and the mosquito vector are required for a spillover event to occur (Moyes et al., 2014). Imai et al. (2014) have linked increasing numbers of cases with deforestation and the disruption of the macaque and mosquito habitats. With increasing road building, forest clearances for farming or livestock grazing and logging practices – be it large scale or for specific trees – forests become more fragmented, the forest edge increases in size, and the deeper less disturbed areas of the forest become more accessible to humans, and the reservoirs and vectors can change the way they use the different forest types (Paige et al., 2016; Patz et al., 2004; Vythilingam, 2010). Various mosquito species have been incriminated as the vector of P. knowlesi, and mosquitoes carrying the pathogen have been identified at sampling sites in habitats likely visited by humans and macaques in Malaysian Borneo, often in habitats not typically associated with that vector species (Vythilingam, 2010).
Fornace et al. (2016) examine the relationship between P. knowlesi infection in Sabah and the proportion of forest surrounding a village, finding that clearance in the previous year was correlated with increased infection, but not clearance that year. This would imply that longer term changes and adaptation to the forest clearance – with some forest regrowth – is important to understanding the changes in macaque movement, mosquito populations and the species interface. In Malaysia, some species in the An. leucosphyrus group have been found away from their usual closed forest habitat, in the forest fragments, where NHPs are increasingly found, and even in villages in the case of Anopheles cracens (Vythilingam, 2010).
Figure 1: showing the distribution of the P. knowlesi reservoir, taken from Moyes et al. (2014)
P. knowlesi infections are prevalent in M. fascicularis populations in multiple South East Asian countries as seen in Figure 1 above (from Moyes et al. 2014), and the first non-human infection in Laos was detected recently (Cox-Singh, 2012; Zhang et al., 2016). In each area the human-macaque-mosquito interface is different, and the local situation must be considered (Paige et al., 2016).
The MONKEYBAR group (LSHTM Malaria Centre) are currently undertaking a long-term study to collect and analyse data relating to the similarities and differences in the natural history of P. knowlesi in Sabah, Malaysia, and the Island of Palawan, The Philippines. Despite being ecologically similar and geographically proximate, there are large numbers of P. knowlesi cases in Sabah, but very few on Palawan. When understanding P. knowlesi risk it is necessary to understand the ecology of the reservoir, vectors, host of interest and their overlap within the environment.
Much of the macaque daily time budget is given over to foraging (Md-Zain, n.d.), and the movement of primates in particular is shaped by preferential return to previously visited sites and heterogeneity of resource distribution (Boyer et al., 2011; Boyer and Walsh, 2010). Macaques typically prefer secondary forest in proximity to rivers and coastline (Fooden, 2006), but with increasing contact with humans, are becoming increasingly gregarious. M. fascicularis are considered to be a weed species – so named because of their ability to adapt to living in close proximity to humans, to flourish in urban environments, and to depend on farmland for a substantial portion of their diet (Richard et al., 1989). Macaques are primarily frugivorous, but in the dry and early wet season when fruit is not abundant, are generally known to focus on other foods as a fall-back (Fooden, 2006). Therefore understanding the distribution of resources in the study site will likely shed light on macaque distributions.
P. knowlesi accounted for 62% of malaria cases in Sabah, Malaysia, in 2013, and so represents a significant health threat (Fornace et al., 2016). At this time the incidence of P. falciparum was decreasing as the incidence of P. knowlesi increased, implying that different control methods are required for the two species and presenting a new challenge for malaria elimination in Malaysia. With increasing forest fragmentation bringing macaques into closer contact with humans, there are concerns that a similar increase in P. knowlesi cases could be seen in Palawan. Palawan is a very popular destination for domestic and international tourists, and an increase in P. knowlesi cases, particularly if cases are then transported to countries with little experience of malaria, represents a real human health threat. Macaques carry various diseases, including other malaria species with zoonotic potential, and other bacterial and viral pathogens which could be harmful to human health with increased population overlap (Bailey and Mansfield, 2010; Paige et al., 2016; Zhang et al., 2016).
This paper uses data collected by the MONKEYBAR group to examine the macaque-human interface, in order to examine the risk of P. knowlesi to the local human population of Palawan and visiting tourists. Data on the changes in macaque presence/absence with changing human forest use patterns is considered alongside survey data of the human population in the study site, building a picture of the macaque behaviour and movement. Data were collected from three 2km transect walks between February 2013 and May 2014. The transect sites were chosen for their varied environments, including mixed higher elevation secondary forest and forest edge (Site I), undisturbed lower elevation forest (Site II), and forest heavily disturbed by agriculture (Site III). Once a month at each of the three sites, a macaque census was conducted and phenology data collected.
The overall aim of the project is to describe macaque behaviour and movement, with reference to the risk of disease transmission, in particular P. knowlesi.
- Investigate macaque-human interactions: human forest use is likely to be a driver of encounter rates, along with crop raiding events. Human survey data on when and where macaques are seen, along with the macaque behaviours at the time, will provide data on locations where macaque and human areas of activity overlap, and the nature of the interactions.
Interface hypothesis – if macaque-human interactions are generally aggressive, I expect macaque density to increase with distance from roads and houses, and with decreasing levels of human disturbance.
- Explore crop raiding behaviour: using human survey data to identify trends in frequency of crop raiding events, relate this to the availability of fruit and changes in the landscape and macaque densities from the transect walks. Using phenology and weather pattern data it is possible to compare trends in crop raiding and fruit availability.
Crop raiding hypothesis – given reported higher levels of hunting in Palawan than Sabah, I expect crop raiding events to be linked to a seasonal drop in food availability, reducing the cost of potential interactions with humans at farms if food is less available in the forest.
- Examine macaque behaviours in relation to the study site as a whole: using macaque sightings and feeding tree data from the phenology surveys, relate this to patterns of macaque movement in order to more thoroughly build a picture of macaque movement that considers human influence alongside natural heterogeneities in the forest composition.
Primatology hypothesis – macaques are expected to be found more frequently in areas with higher densities of feeding trees, more feeding species available, and higher fruit availability.
The primatology (forest composition) hypothesis provides essential context for the interface and crop raiding hypotheses: the interface hypothesis proposes a push factor away from settlements, the crop raiding hypothesis proposes a pull factor towards settlements, and the primatology hypothesis proposes a pull factor towards more diverse forest with a higher density of feeding trees. The strength of these influences on the macaques (level of acclimatisation to humans, fruit availability in the forest) will change their behaviour when encountered, the likelihood of proximity to humans, and the risk of P. knowlesi spillover events.
Within the Palawan study area, seen below in Figure 1, houses were surveyed and three sites were chosen to conduct the macaque census and investigate forest composition. The sites were chosen for their differences in forest composition and levels of disturbance.
Figure 1: the study site, showing houses, the transect points at each site in black, and the botanic plots in yellow.
A survey of each household in the study site (locations shown in Figure 1) was conducted in November 2014 by MONKEYBAR researchers based at The Royal Institute of Tropical Medicine in Manila. Demographic data and perceptions of macaques were gathered at household level, while data on macaque sightings in the previous four weeks were gathered at individual level. There are no other similar primates on Palawan, so it is very unlikely that the survey participants gave answers about a species other than Macaca fascicularis, particularly as the local name was also used (Kühl et al., 2008; Meijaard et al., 2011).
Data sets relating to forest composition and macaque presence/absence were received and are summarised in Table 1. There were 52 weeks of data, collected between February 2013 and May 2014. On occasion sites were inaccessible due to typhoons, and the macaque census was delayed by a month. At each site there were 24 censuses, with each transect walked in the morning and the evening.
Line and point transect observations were both recorded, but the methodology (line or point) was not identified for each observation. GPS points were taken at the point of detection on each transect, so point transect data should have a GPS point that corresponds to one of the point transect points. The Garmin GPS units used are accurate to 10m (“Garmin – aboutGPS,” n.d.); this was doubled to allow for the difference between two points, and doubled again to allow for inaccuracies introduced by forest cover and elevation changes. I therefore assume that all presence points within 40m of a recorded point transect point were detected using the point transect methodology, rather than by opportunistic line sampling. Line and point transect data are not directly comparable because of their different sampling efforts and detectabilities (Buckland and Handel, 2006; Kühl et al., 2008; Thomas et al., 2010). This is an important assumption and limitation of the data (see Appendix 1 for a further explanation of line and point transect methodologies).
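The 40m rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coordinates are hypothetical, the function names are invented for the example, and distance is computed with the haversine formula rather than with the ArcGIS tools used in the study.

```python
import math

GPS_ACCURACY_M = 10  # stated accuracy of the GPS units
# Doubled for the difference between two fixes, doubled again for canopy and elevation effects
TOLERANCE_M = GPS_ACCURACY_M * 2 * 2  # 40 m

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude pairs."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))

def classify_detection(obs, grid_points, tolerance_m=TOLERANCE_M):
    """Label an observation 'point' if it lies within the tolerance of any grid point, else 'line'."""
    nearest = min(haversine_m(obs[0], obs[1], p[0], p[1]) for p in grid_points)
    return "point" if nearest <= tolerance_m else "line"

# Hypothetical coordinates, for illustration only
grid = [(9.5000, 118.5000)]
near_obs = (9.5002, 118.5000)   # roughly 22 m north of the grid point
far_obs = (9.5010, 118.5000)    # roughly 111 m north of the grid point
```

Here `classify_detection(near_obs, grid)` falls inside the tolerance and `classify_detection(far_obs, grid)` does not, which mirrors how presence points were separated into assumed point transect and opportunistic line detections.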
Data on forest disturbance and type were collected at 20m intervals along each transect, and each interval was classified as ‘Disturbed’, ‘Moderately Disturbed’ or ‘Undisturbed’ forest (Appendix 1, Table 1). Bishop et al. (1981) categorise disturbance of habitat using home range, level of harassment, habituation, and presence of predators. We are unable to define home range. However, on Palawan macaques are only predated by humans, and with increasing human encroachment on the forest and a lack of macaque habituation to humans (PM Kim 2017, personal communication, …) we can assume that as forest disturbance increases, the negative impact on macaques increases (Richard et al., 1989). Forest was therefore classified as disturbed if there was evidence of human activity, and as moderately disturbed if there was no overt evidence of human activity but the field team still described it as disturbed. Once visualised in ArcGIS, these classifications corresponded well with the descriptions of the sites given by the field team.
As seen in Figure 1, each site had 6 transects, 1km in length, spaced 200m apart, then divided in 200m segments – forming a grid of 36 points 200m apart. These points were used to conduct the point transect macaque census. Botanic plots were used to gather forest composition data at the beginning of the study. The plots have a width of 10m either side of the transect, and are 100m in length. Some transects in sites II and III have more than one plot, as it was randomised. Plots in site I were chosen for a good tree density, so this may overestimate the tree density. Sites II and III were randomised, then the locations the team were able to access were selected.
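The grid layout can be checked with a short sketch. It assumes, as an interpretation of the description rather than a documented detail, that census points sit at both ends of each 1km transect (0m and 1000m), which gives six points per transect; coordinates are local offsets in metres, not real map positions.

```python
TRANSECTS = 6             # transects per site
TRANSECT_LENGTH_M = 1000  # each transect is 1 km long
SPACING_M = 200           # spacing between transects and between points

def transect_grid():
    """Return (across, along) offsets in metres for every census point at one site."""
    points = []
    for t in range(TRANSECTS):
        # Assumption: points at 0, 200, ..., 1000 m along each transect
        for along in range(0, TRANSECT_LENGTH_M + 1, SPACING_M):
            points.append((t * SPACING_M, along))
    return points

grid = transect_grid()
print(len(grid))  # 36
```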
- Site I: high elevation that is undisturbed, then slopes down towards the low elevation near the road in the north of the site, seen in Figure 2. The north of the site is more disturbed, but not farmed.
Figure 2: site I, showing the level of forest disturbance at 20m intervals, locations of houses, and macaque sightings
- Site II: low elevation coastal site, largely undisturbed as seen by the predominantly green colour of the transects in Figure 3. Site II is completely inaccessible other than by boat and the entrance to the site has a very small community of fishermen, shown in the pink marker for a house. This area of the site is the area with most disturbance.
Figure 3: showing site II, with levels of disturbance at 20m intervals shown, locations of houses, and macaque sightings
- Site III: heavily disturbed site as seen in Figure 4. The trees were taken for charcoal making at the start of transects 3, 4, 5 and 6, and then replanted. The site is mostly agricultural land, which is replanted with a range of crops, including corn. This site contains some bamboo forest. The north-west of the site is the most heavily disturbed part of all the study sites, being heavily logged, replanted and farmed.
Figure 4: showing site III, with level of forest disturbance at 20m intervals, house locations, and macaque sightings
Data on trees present in each botanic plot at the beginning of the study were cleaned, and tree density and feeding tree density were calculated for each plot (plot locations are shown in yellow in Figure 1). A combined list of suspected and verified feeding trees was used to define feeding species. Plots were assigned a forest disturbance level based on maximum disturbance. The plot data were then used to describe the study area, and to determine whether forest type varied between sites and whether the same forest classification had the same composition in different sites. Plots were assessed for abundance/presence of tree species (availability) and for the number of trees in a plot (density).
In order to combine the botanic plot (tree density etc.) data with the macaque census data, the plot specific data was entered where a transect point was found inside a plot. Where transect points did not overlap with a plot, the average for that forest type, within that site was calculated and used.
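A minimal sketch of this assignment rule follows. The field names (`plot_id`, `site`, `forest_type`, `feeding_density`) and the toy values are hypothetical, chosen only to show the fallback from plot-specific data to the average for the same forest type within the same site.

```python
from statistics import mean

def feeding_density_for_point(point, plots):
    """Use the plot's own value if the transect point lies in a plot,
    otherwise the mean over plots of the same forest type in the same site."""
    if point["plot_id"] is not None:
        for plot in plots:
            if plot["plot_id"] == point["plot_id"]:
                return plot["feeding_density"]
    matching = [
        plot["feeding_density"]
        for plot in plots
        if plot["site"] == point["site"] and plot["forest_type"] == point["forest_type"]
    ]
    # None signals that no plot of this forest type exists in the site
    return mean(matching) if matching else None

# Toy plot data, for illustration only
plots = [
    {"plot_id": "A", "site": "I", "forest_type": "Undisturbed", "feeding_density": 60},
    {"plot_id": "B", "site": "I", "forest_type": "Undisturbed", "feeding_density": 80},
    {"plot_id": "C", "site": "I", "forest_type": "Disturbed", "feeding_density": 30},
]
inside = {"plot_id": "A", "site": "I", "forest_type": "Undisturbed"}
outside = {"plot_id": None, "site": "I", "forest_type": "Undisturbed"}
```

With these toy values the point inside plot A takes that plot's value, while the point outside any plot takes the mean of plots A and B.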
A monthly phenology survey considered flowers, young leaves, unripe fruits, ripe fruits and vines. Each tree was classified as having none, few, some, or many. The survey also recorded instances of tree death through the study. All trees with a Diameter at Breast Height (DBH) of 10cm or above were recorded and used in the monthly phenology surveys. Where possible, the same person did the phenology plots, to try and eliminate observer/recorder bias between surveys. The data for each plot were combined by month to give an average proportion of trees in each site with no, few, some and many ripe and unripe fruits. The data on young leaves, flowers and vines were not used.
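The monthly summary can be illustrated as follows. This sketch pools all trees surveyed in a month and reports the proportion in each abundance class; the records are invented, and pooling trees is a simplification of the plot-by-plot combination described above.

```python
from collections import Counter, defaultdict

LEVELS = ("none", "few", "some", "many")

def monthly_proportions(records):
    """records: iterable of (month, score) pairs, one per tree per survey.
    Returns {month: {level: proportion of trees at that level}}."""
    counts = defaultdict(Counter)
    for month, score in records:
        counts[month][score] += 1
    result = {}
    for month, counter in counts.items():
        total = sum(counter.values())
        result[month] = {level: counter[level] / total for level in LEVELS}
    return result

# Hypothetical ripe-fruit scores, for illustration only
records = [
    ("2013-02", "none"), ("2013-02", "none"), ("2013-02", "few"), ("2013-02", "many"),
    ("2013-03", "some"),
]
props = monthly_proportions(records)
```

Keeping the result keyed by month preserves the seasonal pattern instead of averaging it away.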
When relating the phenology data to the macaque census data, monthly values were used for the transect points that overlapped with the plots. The data on ripe and unripe fruit was therefore month and plot specific.
Preparation for multivariable analysis
I intended to use the Distance Sampling programme to calculate the density of macaques in each forest type (Buckland et al., n.d.; Buckland and Handel, 2006; Thomas et al., 2010). With 52 observations however, there were insufficient presence points for this methodology to work for point transect data (Kühl et al., 2008). Therefore, logistic regression was used to calculate odds ratios for macaque sightings across the different variables.
Data were collated in order to conduct further analyses. Transect points form the data structure; at each transect point, for each census, there is either a presence or an absence point, and environmental variables for each transect point. Distance to the nearest house (gathered from human survey data) was plotted in ArcGIS and extracted using the near tool for point data. The distance to the road was extracted from a raster data set, using the ‘near’ tool.
Plots are situated at various points within the transects, however there are many transect points without a corresponding plot. In order to determine forest composition data for each transect point, the phenology and botanic plot data were averaged by forest type within a site. Where there was a point transect within a plot, the specific plot data was used, rather than an average.
Bivariate analyses were performed to determine which variables to enter into the model using Chi-square and independent sample t-tests, and the variables identified were then assessed for multicollinearity using multiple linear regression.
Logistic regression analyses were performed in R to identify characteristics of the study site that were associated with the presence or absence of macaques.
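To make the odds ratio interpretation concrete, the sketch below covers only the simplest case: one binary predictor, where the odds ratio from a logistic regression equals the cross-product ratio of the 2x2 table. It is written in Python for illustration and is not the R model used in the analysis; the counts are invented, and the interval is a standard Wald approximation.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio and approximate 95% Wald interval from a 2x2 table.
    a: macaques present, exposed      b: macaques absent, exposed
    c: macaques present, unexposed    d: macaques absent, unexposed"""
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper

# Invented counts: "exposed" could stand for census points in undisturbed forest
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
print(round(estimate, 2))  # 2.25
```

An odds ratio above 1 would indicate higher odds of a macaque detection at exposed points; with these invented counts the interval spans 1, so the toy data would not support a difference.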
To investigate the interface, crop raiding and primatology hypotheses, the results from the human survey, macaque census, and phenology surveys are considered below.
Between the 11th and 14th November 2014, households within the study site were asked whether they had seen macaques in the past four weeks. Out of 489 respondents, 120 had seen macaques in the past four weeks, adding to a total of 523 macaque observations between October and November 12th 2014. These data were used to investigate the interface and crop raiding hypotheses.
Table …: locations of the macaque encounters, as reported in the human survey
As seen in Table …, the vast majority of encounters happened in the forest, with the next largest proportion of encounters at farms. Of the people who saw macaques, 13.4% were farmers, 3.5% drivers, 6.4% housewives, 11.3% students, 5% fishermen, 4.3% worked in construction, 5% gathered sawali/buho, 5.7% made charcoal, and 46.8% reported ‘other’. Of the farmers who saw macaques, 20% reported some
Table …: showing the ways the survey respondents reported using the forest, includes people who have and have not seen macaques
|Forest use||Wood||Hunting||Food||Do not use||Other|
Using the forest for wood is the most common reported reason for entering the forest, apart from ‘other’. Only 2.5% report using the forest to hunt, and hunting is now reported to be infrequent, although it was common in previous years (PM Kim, personal communication, …). The knowledge of how to hunt is still alive, but there is one instance of hunting recorded in the macaque census notes and three survey respondents reported hunting, so there is evidence that some hunting still occurs at the site.
The crop raiding and primatology hypotheses – that crop raiding will be linked to seasonal fruit availability, and that macaques will prefer areas with denser forest and higher availability of feeding species – require information on macaque movement and detailed data about the forest composition of the three sites.
In order to maintain sample size, I included both auditory and visual detections of the macaques. It is highly likely that some auditory detections are false positives, particularly where the detection is based upon moving branches and the sound of an animal moving through the canopy. While experienced fieldworkers can often identify the type of animal present from the sound of its movement, even the most experienced trackers can be incorrect. Auditory detections based upon calls are likely to be more reliable, although there are a number of bird species which can mimic macaques. At the study site the Ashy Fronted Drongo and the Spangled Drongo are reported to mimic macaques with high fidelity. It is possible that the Hill Myna may also mimic macaque calls, but this is not yet documented at the study site (PM Kim 2017, personal communication, ).
- Site I: 17 of 36 grid points have macaque sightings. There were 28 total macaque sightings across all of the censuses and transects at Site I, and 11 of these were auditory detections (39.3%).
- Site II: 12 transect points have macaque sightings. There were 19 total macaque sightings across all of the censuses and transects at Site II, and five of these were auditory detections only (26.3%).
- Site III: five of the transect points have macaque sightings. There were five total macaque sightings across all of the censuses and transects at Site III, and three of these were auditory detections (60%).
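The auditory percentages in the list above follow directly from the reported counts, as this short check shows. The overall figure is derived here from the same counts and is not stated in the text.

```python
# (total detections, auditory detections) per site, as reported above
detections = {"Site I": (28, 11), "Site II": (19, 5), "Site III": (5, 3)}

def auditory_share(total, auditory):
    """Percentage of detections that were auditory."""
    return 100 * auditory / total

shares = {site: round(auditory_share(*counts), 1) for site, counts in detections.items()}

total_all = sum(total for total, _ in detections.values())
auditory_all = sum(auditory for _, auditory in detections.values())
overall = round(auditory_share(total_all, auditory_all), 1)

print(shares)   # {'Site I': 39.3, 'Site II': 26.3, 'Site III': 60.0}
print(overall)  # 36.5
```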
Thus far, the data supports the hypothesis that macaques preferentially spend time in undisturbed places that are far from humans. There is a clear trend in the observation data and number of macaques, as seen in Tables 5 and 6 below. The average group size is higher in disturbed areas than in moderately disturbed or undisturbed areas, but were this a true effect we would expect higher group sizes in site III than are seen. The apparent relationship between group size and forest type may be caused by a lower tree density in disturbed areas, resulting in higher detectability of the macaques.
Table 5: macaque observations, stratified by site
|Macaques by site||Site I||Site II||Site III|
|Number of observations||28||19||5|
|Number of macaques||111||105||11|
|Estimated size of troop||3.9||5.5||2.2|
Table 6: macaque observations, stratified by forest type
|Macaques by forest type||Disturbed||Moderately disturbed||Undisturbed|
|Number of observations||6||12||34|
|Number of macaques||47||45||179|
|Estimated size of troop||7.83||3.75||5.26|
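The troop size rows in Tables 5 and 6 appear to be the number of macaques divided by the number of observations. The sketch below recomputes them from the tabulated counts on that assumption. The observation counts sum to 52 in both tables, whereas the macaque totals differ (227 by site, 271 by forest type), so each table is treated separately as reported.

```python
# (observations, macaques) taken from Tables 5 and 6
by_site = {"Site I": (28, 111), "Site II": (19, 105), "Site III": (5, 11)}
by_forest = {
    "Disturbed": (6, 47),
    "Moderately disturbed": (12, 45),
    "Undisturbed": (34, 179),
}

def mean_group_size(table):
    """Macaques per observation for each stratum, rounded to two decimals."""
    return {name: round(macaques / obs, 2) for name, (obs, macaques) in table.items()}

site_sizes = mean_group_size(by_site)
forest_sizes = mean_group_size(by_forest)

print(site_sizes)    # {'Site I': 3.96, 'Site II': 5.53, 'Site III': 2.2}
print(forest_sizes)  # {'Disturbed': 7.83, 'Moderately disturbed': 3.75, 'Undisturbed': 5.26}
```

The recomputed values for sites I and II (3.96 and 5.53) suggest that the tabulated 3.9 and 5.5 were truncated rather than rounded.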
These results seem to correspond with the reports from the field team that the macaques produced predator calls when people were seen, and consistently ran away from people at all sites. There appears to be no habituation, even in the disturbed sites, shown by the lack of any evidence of macaques in the disturbed areas of site III (PM Kim 2017, personal communication, ).
It is important to note that the number of tree species at the study site is exceptionally high. At all sites the number of tree species and families continued to increase with each botanic plot. It is likely that had more plots been surveyed, further species and families would have been identified. As with focus groups, we would ideally continue to sample until no new species were recorded. Therefore, despite having recorded a significant proportion of the tree species, we may not have a complete picture of the species and diversity at and between each site.
This project examines hypotheses which require an understanding of the forest composition, macaque, and human behaviour. In order to combine these varying sources and carry out further analyses, the macaque census and forest composition data must be combined into one data set. The following section examines the differences in the sites and in the disturbance levels, in order to understand the heterogeneities between the sites and decide on the best way to split and group the forest composition data.
We expected evidence of difference in fruit abundance between the forest types, however there was no evidence against the null hypothesis of no difference in the mean proportion of trees with ripe fruit between the forest types (F = 0.632, p = 0.539). Similarly there was no evidence against the null when examining the proportion of trees with unripe fruits (F = 1.362, p=0.273).
Proportions of trees with few, some and many ripe and unripe fruits were plotted by month (Appendix 2, graphs 1 to 6). Despite the proportion with any fruit being very low, the graphs showed clear seasonal trends. Therefore when comparing the phenology data with macaque sightings, seasonality should be taken into account. It would not be appropriate to average the phenology data across time.
In order to investigate the primatology hypothesis, it was necessary to determine whether there were true differences between the sites in tree species, tree density, and disturbance level.
There are 150 unique tree species identified in the study, heterogeneously distributed through the sites, with all sites sharing 30 common species. Descriptive statistics are shown below in Table 2. The sites can be differentiated when looking at forest type and abundance of feeding trees within each site. Overall sites I and II were more similar to each other than to site III.
Table …: descriptive statistics by site, only the data estimated to be point transect data was used for the macaque detections
|Measure||Site I||Site II||Site III|
|Tree density (per hectare)||804||1000||834|
|No. of species (total)||85||69||98|
|No. of feeding species unique to the site||20||18||40|
|No. of feeding trees (combined)||436||545||409|
|Prop. of feeding species available at site – total (verified, suspected)||0.7568
|Disturbed forest classification (%)||14.29||0||85.71|
|Moderately disturbed (%)||54.55||25||14.29|
|Feeding tree density (per hectare) Combined (472)||436||545||409|
|Feeding tree density (per hectare) Verified (250)||284||150||314|
|Feeding tree density (per hectare) Suspected (223)||152||395||95|
Overall tree density
There is suggestive evidence of difference in tree densities between the sites (p = 0.071), as expected as the sites were picked based upon their suspected differences in forest composition and level of human disturbance. This is likely to be linked to forest disturbance level, which is observed to differ widely between the sites.
Availability of all species
There is no evidence of difference in the number of species available between the sites (F=1.781, p=0.188), possibly reflecting the high, and still increasing, number of species found overall in the study area. Using the Chi-square statistic, there are more unique species than expected in site III, and fewer species are shared with the other sites than is expected by chance. There was no evidence of a difference between the numbers of species between sites I and II.
Feeding tree density
There is strong evidence of an overall difference in the mean number of feeding trees (shown in Table 3) found between the sites (F=7.836, p < 0.01), with site II showing strong evidence of difference from site III (p=0.002), and suggestive evidence of difference from site I (p=0.035). There was no evidence of difference between sites I and III (p=0.462). The means are consistent with the hypothesis that the less disturbed site would have a higher density of feeding trees than the more disturbed sites I and III. Under the primatology hypothesis we would expect site II to therefore have more macaque sightings.
Availability of feeding species
When examining the proportion of feeding species available at all three sites, the proportion of total feeding species was 0.46, verified species was 0.57 and suspected species was 0.39. Verified and suspected feeding trees have been delineated due to the differences in sampling style – the suspected species are the species that the local population have reported seeing the macaques eat. The local population will likely see the macaques more frequently close to areas that they use, and likely in a certain forest type resulting in biased sampling.
Site II appears to have the lowest proportion of feeding trees, yet has the highest density of feeding trees, reflecting the overall higher tree density in site II (seen in Table 3 above). There is no evidence of difference between the number of feeding species present between sites I and II (p=0.927), however there is strong evidence that site III differs from sites I and II (p = 0.005 and p = 0.011, respectively).
The forest type shows strong evidence of difference between the sites (percentages above in Table 2). This result suggests that forest type and disturbance level will be an important factor when looking at macaque densities, and differences between the forest types should be investigated further.
Examining the forest characteristics by disturbance level helps to remove the confounding effect of the multiple different disturbance levels within one site. When combining the data sets in order to perform multivariable analyses, the analyses below show that it will be more appropriate to split the study site by disturbance level, rather than by site.
The undisturbed forest type is distinct in terms of the overall tree density and the density of feeding trees. More often than not the disturbed and moderately disturbed categories cannot be distinguished statistically. The number of feeding species (availability) shows suggestive evidence of a difference between disturbed and moderately disturbed areas, but no evidence of difference between the moderate and undisturbed. Table 4 summarises the key forest characteristics for each disturbance type.
Table 4: descriptive statistics of the forest composition, stratified by forest type
|Forest type||Disturbed||Moderately disturbed||Undisturbed|
|Number of plots||7||11||12|
|Availability of all species||91||93||94|
|Average number of trees per plot||72.00 (720 per hectare)||79.64 (796 per hectare)||104.83 (1048 per hectare)|
|Availability of feeding species (percentage of combined list)||27
|Density of feeding species
Overall tree density
There is strong evidence of a difference in mean number of trees (density) between the forest types (F = 6.113, p = 0.006; disturbed = 72.00 (50.96, 93.04), moderate = 79.64 (66.23, 93.05), undisturbed = 104.83 (89.90, 119.77)). Pairwise comparisons show no evidence of difference in tree density between the disturbed and moderately disturbed areas (p = 0.854), but strong evidence of a difference between disturbed and undisturbed areas (p = 0.012), and quite strong evidence of a difference between moderately disturbed and undisturbed areas (p = 0.032). The means show greater difference and higher significance levels than when compared by site.
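For readers unfamiliar with the test behind these F values, a one-way ANOVA F statistic can be computed as in the sketch below. The group values are invented for illustration and are not the plot data; with the 7, 11 and 12 plots in the three forest types, the degrees of freedom would be 2 and 27.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Invented tree counts per plot for three forest types
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(round(f_stat, 6), df_b, df_w)  # 13.0 2 6
```

The p-value then comes from the F distribution with those degrees of freedom, which statistical packages such as R report directly.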
Availability of tree species (count)
Similarly to the comparison between sites, there is no evidence to reject the null hypothesis of no difference in the number of species found in each forest type (disturbed = 25.00 (19.43, 30.57), moderate = 24.82 (22.49, 27.14), undisturbed = 26.67 (21.87, 31.46)).
Feeding tree density
We expected the density of feeding species to differ between the forest types, and there was very strong evidence of a difference (F = 16.548, p = 0.00002; disturbed = 31.57 (20.21, 42.93), moderate = 49.73 (38.20, 61.26), undisturbed = 75.33 (63.89, 86.77)). Pairwise comparisons showed suggestive evidence against the null hypothesis of no difference in mean feeding tree density between disturbed and moderately disturbed areas (p = 0.09), quite strong evidence of a difference between moderately disturbed and undisturbed areas (p = 0.003), and very strong evidence of a difference between disturbed and undisturbed areas (p = 0.000021).
Availability of feeding trees (count)
The count of feeding species was expected to show a similar pattern between the forest types as between the sites, although the forest types may contain very different species of trees. There is strong evidence against the null hypothesis of no difference in the mean number of feeding species between the forest types (F = 7.167, p = 0.003; disturbed = 9.14 (7.50, 10.78), moderate = 12.55 (10.66, 14.43), undisturbed = 14.58 (12.25, 16.92)). Pairwise comparisons show suggestive evidence of a difference in the number of feeding species between the disturbed and moderate areas (p = 0.079), and strong evidence of a difference between the disturbed and undisturbed areas (p = 0.002). There is no evidence of a difference between the moderate and the undisturbed areas (p = 0.307).
Overall, the comparison of forest characteristics between the sites and between forest types shows that forest type is a better representation of the forest composition than site. Further analyses will therefore group plot data by forest type rather than by site.
Bivariate analyses were performed to determine which variables to enter into the model. A chi-square statistic was calculated to examine the relationship between the sightings and the categorical forest composition. The Pearson chi-square test showed strong evidence of a difference between the observed and expected numbers of sightings across the forest composition categories (χ² = 12.845, p = 0.002), with more sightings than expected in undisturbed forest, and fewer in disturbed forest. Observed and expected sightings in moderately disturbed forest were roughly equal. There was quite strong evidence of a difference between the number of sightings in the morning and in the afternoon censuses (χ² = 5.011, p = 0.025), with more observations than expected in the morning censuses, and fewer than expected in the afternoon.
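As a consistency check, the two reported p-values can be recovered from the chi-square statistics using only the Python standard library. This is a sketch, not the original SPSS output: it assumes 2 degrees of freedom for the three forest composition categories and 1 degree of freedom for the morning/afternoon comparison, neither of which is stated in the text, and `chi2_p_value` is an illustrative helper.

```python
import math


def chi2_p_value(statistic, df):
    """Upper-tail p-value of a chi-square statistic (closed forms for df = 1 or 2)."""
    if df == 1:
        # A chi-square variable with 1 df is the square of a standard normal
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        # A chi-square variable with 2 df is exponential with mean 2
        return math.exp(-statistic / 2)
    raise ValueError("closed form implemented for df = 1 or 2 only")


# Assumed degrees of freedom: (3 - 1) forest categories, (2 - 1) census times
p_forest = chi2_p_value(12.845, df=2)
p_time = chi2_p_value(5.011, df=1)

print(round(p_forest, 3), round(p_time, 3))
```

Under these assumed degrees of freedom, both values round to the p-values reported in the text.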
Independent samples t-tests were used to examine the relationship between the continuous explanatory variables and the presence or absence of macaques, using Levene’s statistic to test equality of variances. The results are reported in Table 7. Overall tree density showed suggestive evidence of a difference in the mean number of trees (p = 0.072), with 818.4 trees per hectare in the no-sighting group and 881.23 in the sighting group. Availability of feeding species (count) showed very strong evidence of a difference between the sighting and no-sighting categories (p < 0.001), with a higher number of species available in the areas where there were sightings. Distance to the nearest house showed strong evidence of a difference between the categories (p < 0.00001), with transect points with sightings on average 230 m further from houses than transect points without sightings. The number of trees with fruit, given both as a percentage and as a count to control for the number of trees within a plot, showed no evidence of a difference between points with and without sightings.
Table 7: results of the independent samples t-tests comparing points with macaque presence and absence
|Variable||t Statistic||P-value||Macaque presence||Macaque absence|
|Tree density (/ha)||t = -1.839||0.072||881.23||818.4|
|Feeding tree density (/ha)||t = -1.21||0.226||564.5||526.1|
|Feeding species count (number of species /plot)||t = -3.605||<0.001||13.59||12.02|
|Distance to nearest house (m)||t = -4.945||<0.00001||957.57||728.06|
|Distance to the road (m)||t = 0.465||0.642||1272.14||1327.72|
|Percentage of trees with fruit (/plot)||t = -0.504||0.622||4.58||4.24|
|Number of trees with fruit (/plot)||t = 0.375||0.712||41.72||44.74|
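The t statistics in Table 7 come in two variants, depending on whether Levene's test supports equal variances. A minimal sketch of both is given below using the Python standard library. The distances are hypothetical values chosen for illustration and are not the study data, and the sign convention (absence mean minus presence mean) is inferred from the negative statistics in Table 7 rather than stated in the text.

```python
import math
import statistics


def pooled_t(absence, presence):
    """Student's t statistic (equal variances assumed), absence minus presence."""
    n0, n1 = len(absence), len(presence)
    pooled_var = ((n0 - 1) * statistics.variance(absence)
                  + (n1 - 1) * statistics.variance(presence)) / (n0 + n1 - 2)
    se = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    return (statistics.mean(absence) - statistics.mean(presence)) / se


def welch_t(absence, presence):
    """Welch's t statistic (equal variances not assumed), absence minus presence."""
    se = math.sqrt(statistics.variance(absence) / len(absence)
                   + statistics.variance(presence) / len(presence))
    return (statistics.mean(absence) - statistics.mean(presence)) / se


# Hypothetical distances to the nearest house (m); NOT the study data
absence_m = [510.0, 640.0, 700.0, 735.0, 790.0, 820.0, 905.0]
presence_m = [800.0, 905.0, 960.0, 1010.0, 1120.0]

t_pooled = pooled_t(absence_m, presence_m)
t_welch = welch_t(absence_m, presence_m)
print(t_pooled, t_welch)
```

With this convention a negative statistic means the variable is larger where macaques were sighted, which is how the distance to nearest house row in Table 7 reads.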
After bivariate analyses, the explanatory variables to be entered into the model include: categorical disturbance level, time of day of the census, tree density, availability of feeding species and distance to the nearest house. Multicollinearity is suspected, particularly between feeding tree availability and overall tree density. To test for this, multiple linear regression was conducted in SPSS, to allow inspection of the variance inflation factors, condition index and variance proportions.
Multiple linear regression and a correlation matrix were used to check for multicollinearity between the continuous variables. When predicting feeding species count, the variance inflation factor (VIF) gave some evidence of multicollinearity between overall tree density and disturbance level. The condition index showed no multicollinearity, however, and the variance proportions indicated some multicollinearity between the availability of feeding trees and overall tree density, but not between disturbance and feeding species count. The VIF also indicated multicollinearity between disturbance, tree density and distance to the nearest house, although there were no other indications of multicollinearity between distance to the nearest house and the other explanatory variables. When linear regression was used to predict tree density, the only indication of multicollinearity was a variance proportion of 0.94 for feeding species count. Overall, it seems likely that there is some multicollinearity between the explanatory variables, and that the model may benefit from the removal of one or more of them.
Once identified, multicollinearity was corrected for by excluding tree density from the analysis, as there was only suggestive evidence that tree density differed between the areas with sightings and those without (Midi et al., 2010). Once tree density had been removed, the linear regression models no longer showed evidence of multicollinearity. The variance proportion between distance to nearest house and feeding species count was 0.96; however, with no other indications of multicollinearity, this was not considered problematic for subsequent analyses.
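The variance inflation factor for a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on the other predictors. The sketch below covers only the two-predictor case, where R² is simply the squared Pearson correlation. The plot-level values are hypothetical and are not the study data, and the function names are illustrative; the analysis described in the text used SPSS.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF = 1 / (1 - R^2); with only two predictors, R^2 equals r squared."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical plot-level values; NOT the study data
tree_density = [60, 72, 75, 81, 90, 98, 104, 110]
feeding_species = [8, 9, 11, 12, 12, 14, 15, 16]

vif = vif_two_predictors(tree_density, feeding_species)
print(round(vif, 2))
```

A VIF of 1 indicates no shared variance between the predictors, and larger values indicate that the coefficient's variance is inflated by collinearity. With more than two predictors, R² would have to come from a full regression of each predictor on all the others.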
A ratio of two absences to one presence was chosen for the undersampling, as this gave relatively consistent results and balanced sensitivity and specificity, and did not lead the logistic regression models to classify all points as no-sighting. ROC statistics were relatively consistent throughout the models.
Barbet-Massin et al. (2012) found that, when using generalised linear modelling (GLM) techniques, the best method for under-sampling the absence data is random sampling. They recommend using a large number of absences with equal weighting for presences and absences. Logistic regression was performed in R using the ‘glm’ function to estimate the coefficients, and odds ratios were obtained by taking the exponential of the coefficients. The coefficients, significance levels and standard errors are reported in Table 8 below.
Given the low macaque densities and detection rates in the study site, 98% of the data points are ‘no sighting’. When a binary logistic regression is performed in R on the full data set, the classification that maximises accuracy predicts every point as ‘no sighting’. This is correct for 98% of the data, giving a specificity of 100% but a sensitivity of 0%.
According to this model, the odds of seeing a macaque in the afternoon are 0.52 times the odds of seeing one in the morning, and the odds of a sighting in moderately disturbed and undisturbed areas are 2.12 and 2 times those in disturbed areas, respectively. Each additional feeding species available in the area increases the odds of a sighting by 4%, and each additional metre from the nearest house increases the odds by 0.2%. Only the time of day and distance from the nearest house showed evidence of significance.
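The effect of this class imbalance on accuracy, sensitivity and specificity can be shown with a small confusion-matrix calculation. The counts below are hypothetical (1000 census points, 2% with sightings) and are chosen only to match the 98% no-sighting share reported in the text; `classification_metrics` is an illustrative helper.

```python
def classification_metrics(actual, predicted):
    """Accuracy, sensitivity and specificity for binary labels (1 = sighting)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true sightings detected
        "specificity": tn / (tn + fp),  # share of true absences detected
    }


# Hypothetical data: 20 sightings and 980 absences (98% 'no sighting')
actual = [1] * 20 + [0] * 980
# A classifier that predicts 'no sighting' for every census point
predicted = [0] * 1000

metrics = classification_metrics(actual, predicted)
print(metrics)
```

The all-absence classifier reaches 98% accuracy while detecting none of the sightings, which is why accuracy alone is a poor criterion for data this imbalanced.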
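The link between the logistic regression coefficients and the odds quoted in the text is the exponential function. The sketch below applies it to the two coefficients that survive in the all-absences row of Table 8 (−0.654 for AM/PM and 0.0016, taken here to be the coefficient for distance to the nearest house). The per-100 m figure is an additional illustration and is not reported in the text.

```python
import math


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(coefficient)


def percent_change_in_odds(coefficient):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (math.exp(coefficient) - 1) * 100


# Coefficients from the all-absences model in Table 8
or_afternoon = odds_ratio(-0.654)                    # afternoon vs morning
pct_per_metre = percent_change_in_odds(0.0016)       # per metre from nearest house
# Illustration only: the same coefficient scaled to a 100 m increase
pct_per_100m = percent_change_in_odds(0.0016 * 100)

print(round(or_afternoon, 2), round(pct_per_metre, 1), round(pct_per_100m, 1))
```

The first two values reproduce the 0.52 odds ratio and the 0.2% per-metre increase quoted in the text. Because effects multiply on the odds scale, a small per-metre coefficient still corresponds to roughly a 17% increase in odds per 100 m.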
Table 8: results from the undersampled bootstrapped models, coefficients, significance levels, and standard error of the coefficient in brackets
|Model||AM/PM||Moderately disturbed||Undisturbed||Feeding species count||Distance to nearest house|
|All absences||-0.654; p < 0.05|| || || ||0.0016; p < 0.001|
|Bootstrapped (1000, 4:1)||-0.381|| || || || |
|Bootstrapped (1000, 4:1)||-0.304|| || || || |
Due to the classification problems, undersampling of the no-sighting data, followed by resampling and bootstrapping, was attempted. Different numbers of absences were trialled (results in Table 1, Appendix 2). Absence data were subsampled using the under-sampling methods in the ‘ROSE’ package in R. An absence:presence ratio of 10:1 was tested first; however, the logistic regression model still classified all data as absences, so a further reduction in the number of absences sampled was required. A bootstrapped logistic regression was then performed on the undersampled data using the ‘glm’ function, with 1000 replications.
The two bootstrapped models indicate that the odds of seeing a macaque in moderately disturbed areas are 1.66 and 1.73 times those in disturbed areas, and that the odds in undisturbed areas are 1.99 and 1.75 times those in disturbed areas. Each additional feeding species increases the odds of a sighting by 5.8% and 6.6%, and each additional metre from the nearest house increases the odds by 1.1% and 0.1%. These results show roughly the same patterns as the all-absences model; however, the standard errors are large relative to the coefficients, implying that the estimates have low precision.
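The undersample-then-bootstrap procedure can be sketched with the Python standard library. This is a stand-in under stated assumptions, not the ‘ROSE’ and ‘glm’ workflow used in the study. The census records are simulated, and the model is reduced to a single binary predictor (afternoon vs morning), for which the logistic regression coefficient is the log odds ratio of the 2×2 table. A 0.5 continuity correction is added to each cell, so the estimate is approximate. All function names are illustrative.

```python
import math
import random
import statistics


def undersample(records, ratio, rng):
    """Keep every presence and a random subset of `ratio` absences per presence."""
    presences = [r for r in records if r["sighting"] == 1]
    absences = [r for r in records if r["sighting"] == 0]
    n_keep = min(len(absences), ratio * len(presences))
    return presences + rng.sample(absences, n_keep)


def log_odds_ratio(records):
    """Log odds ratio of a sighting, afternoon vs morning, from the 2x2 table.

    0.5 is added to every cell so that an empty cell does not break the log.
    """
    cells = {(s, pm): 0.5 for s in (0, 1) for pm in (0, 1)}
    for r in records:
        cells[(r["sighting"], r["pm"])] += 1
    return math.log((cells[(1, 1)] * cells[(0, 0)])
                    / (cells[(1, 0)] * cells[(0, 1)]))


def bootstrap_log_or(records, ratio=2, replications=1000, seed=1):
    """Mean and standard deviation of the estimate over bootstrap replications."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sample = undersample(records, ratio, rng)
        resample = rng.choices(sample, k=len(sample))  # with replacement
        estimates.append(log_odds_ratio(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Simulated census points (NOT the study data): roughly 2% sightings,
# with sightings less likely in the afternoon (pm = 1)
data_rng = random.Random(0)
records = []
for _ in range(5000):
    pm = data_rng.randint(0, 1)
    p_sighting = 0.014 if pm else 0.026
    records.append({"pm": pm, "sighting": 1 if data_rng.random() < p_sighting else 0})

balanced = undersample(records, 2, random.Random(2))
mean_coef, std_error = bootstrap_log_or(records)
print(len(balanced), round(mean_coef, 3), round(std_error, 3))
```

The spread of the bootstrap estimates plays the role of the standard error: with few presences it is large relative to the coefficient.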