SIMplyBee: an R package to simulate honeybee populations and breeding programs

Background The Western honeybee is an economically important species globally, but has been experiencing colony losses that lead to economical damage and decreased genetic variability. This situation is spurring additional interest in honeybee breeding and conservation programs. Stochastic simulators are essential tools for rapid and low-cost testing of breeding programs and methods, yet no existing simulator allows for a detailed simulation of honeybee populations. Here we describe SIMplyBee, a holistic simulator of honeybee populations and breeding programs. SIMplyBee is an R package and hence freely available for installation from CRAN http://cran.r-project.org/package=SIMplyBee. Implementation SIMplyBee builds upon the stochastic simulator AlphaSimR that simulates individuals with their corresponding genomes and quantitative genetic values. To enable honeybee-specific simulations, we extended AlphaSimR by developing classes for global simulation parameters, SimParamBee, for a honeybee colony, Colony, and multiple colonies, MultiColony. We also developed functions to address major honeybee specificities: honeybee genome, haplodiploid inheritance, social organisation, complementary sex determination, polyandry, colony events, and quantitative genetics at the individual- and colony-levels. Results We describe its implementation for simulating a honeybee genome, creating a honeybee colony and its members, addressing haplodiploid inheritance and complementary sex determination, simulating colony events, creating and managing multiple colonies at the same time, and obtaining genomic data and honeybee quantitative genetics. Further documentation, available at http://www.SIMplyBee.info, provides details on these operations and describes additional operations related to genomics, quantitative genetics, and other functionalities. Discussion SIMplyBee is a holistic simulator of honeybee populations and breeding programs. It simulates individual honeybees with their genomes, colonies with colony events, and individual- and colony-level genetic and breeding values. Regarding the latter, SIMplyBee takes a user-defined function to combine individual- into colony-level values and hence allows for modeling any type of interaction within a colony. SIMplyBee provides a research platform for testing breeding and conservation strategies and their effect on future genetic gain and genetic variability. Future developments of SIMplyBee will focus on improving the simulation of honeybee genomes, optimizing the simulator’s performance, and including spatial awareness in mating functions and phenotype simulation. We invite the honeybee genetics and breeding community to join us in the future development of SIMplyBee. Supplementary Information The online version contains supplementary material available at 10.1186/s12711-023-00798-y.


Swarming
Swarming is the process through which honeybee colonies produce new colonies. When a honeybee colony outgrows its hive, becomes too congested, or too populated for the queen's pheromones to spread among workers, then the swarming begins. The workers start building swarm cells for new virgin queens. When the queen is ready, she leaves the hive and is followed by about 70% of the workers in a massive cloud of flying bees, the swarm [59]. The swarm will cluster on a nearby tree or bush and remain there until they find a suitable new home.
The virgin queens developing in the old hive are daughters of the queen that swarmed and are attended by the remnant workers that did not leave with the swarm. After few days, the new virgin queens begin to emerge. Typically, the first queen to emerge will kill the rest of virgin queens to assume the role as the new queen for the colony. She will then go on a mating flight to find drones to mate with to begin laying eggs and rebuilding the workforce in the colony [59,60].
In SIMplyBee, function swarm() simulates swarming (Figure 1). The function takes a Colony class object and a percentage p of workers that leave with the swarm. The function returns a list with two Colony class objects, swarm and remnant. The swarm contains the old queen and p percentage of workers that left the hive. The remnant contains the rest of workers (1-p), all the drones, and virgin queens that are daughters of the old queen that swarmed.  We see that the remnant colony does not have a queen but has one virgin queen that will have to be mated. It also has 60 workers since we set p argument to 0.4, meaning that 40% of workers left with the swarm. All the drones remained in the remnant. Note that the swarm status is turned to TRUE.
The swarm contains the old queen, no virgin queens, and 40 workers, since we set the proportion p to 0.4. Same as in the remnant, the swarm status is turned to TRUE in the swarm.
Swarming also turns the production slot to FALSE for both, the swarm and the remnant colony, since we would not be able to collect honey from such colonies (or other products).
The swarm stays genetically identical to the "old" colony, although downsized. Assuming that we have caught the swarm, we assign it back to the original colony object. The remnant has a new queen and is hence genetically different from the original colony. Thus, we assigned it to a new colony object.
colony <-tmp$swarm colony1 <-tmp$remnant After swarming, the colony would usually build-up back to the full size and the virgin queens would mate. Building-up a colony turns the production status back to TRUE. Instead of setting the p every time we call the swarm() function, we can save the swarmP argument in the SimParamBee object. The swarm() function will then use this percentage if p is not set.
The default value is 0.5, but we can set any value we want.
You can also use a non-fixed p parameter by using the function swarmPUnif that samples the p from a uniform distribution between values 0.4 and 0.6 irrespective of the colony strength. You can read more about this in the Sampling functions vignette.

• Swarming a MultiColony
We swarm a MultiColony object in the same was we swarm a single Colony -with the swarm() function. The swarm() function is here applied to each colony in the MultiColony object with the same parameters.
The function now returns a list with two MultiColony objects -one containing the swarms and the other containing the remnants. Above, all the colonies in a MultiColony are swarmed with the same percentage. However, we can also specify a different p for each colony. If we now inspect the first and the second swarm, we see that each colony has a different percentage of workers that stayed or left. If you want to track the genetics, you might want to assign the swarms back to the original apiary and create a new apiary from the remnants. However, if you want to track the position, the remnant actually stay in the original position and would hence be assigned back to the same apiary, while the swarm would be assigned to a new apiary or even be lost. Here, we will track the genetics and assign the swarm back to the original apiary and remnants to a new apiary.

Split
Colony splitting is a common beekeeping technique to limit swarming. A percentage of workers, brood, and food stores are split away to create a new colony or combine two split. Old queen normally stays in the original (remnant) colony. We created a function split() that works on Colony and Multicolony objects ( Figure 2). It accepts the p argument as a proportion of workers that will be split away for a new colony. The output of the function is a list of two Colony or MultiColony objects: remnant that contains the old queen, drones, and (1-p) workers; and split that doesn't contain a queen, but contains virgin queens and p workers. The split() function follows the same principles as the swarm(), hence we will limit explaining the outputs.
• Splitting a Colony   We see that remnant contains the old queen and 70% of workers since we set p argument to 0.3, meaning that 30% of workers are removed in the split. The split status of the remnant colony is turned to TRUE. The production status of the remnant stays TRUE, because in reality we always split a colony in a way that does not threaten the production of the remnant colony.
If we inspect the split, we see that it contains 30% of the workers, the split status is turned to TRUE, and the production status is turned to FALSE, since in reality, these small colonies would not be productive.
We would usually consider the remnant as the original colony, since it tracks its genetics and location. After a split, the colony would build-up back to their full-size.

colony <-buildUp(colony)
The p argument for splitting can be also saved in SP object, so we do not specify it each time we call the function -see SimParamBee$splitP.

• Splitting a MultiColony
We split a MultiColony object in the same way we split a Colony (as shown in swarm). We see that after a supersedure, the old queen is removed and a new virgin queen is ready to mate. Hence, the next step in the simulation would be to cross this virgin queen. We also see that the supersedure status is set to TRUE. The production status stays set to TRUE, since the colony did not loose any individuals (it stayed at it's full-size).

• Superseding a MultiColony
The function supersede() works both on Colony and MultiColony classes. We supersede a MultiColony in a same way as a Colony.

Collapse
Collapse of the colony is a term that describes the death of a colony -when all individuals within a colony die. There are many reasons for the collapse of a honey bee colony: diseases, starvation, queen problems, contamination with pesticides, etc. Colony losses can be high, possibly up to 60% of colonies per year. High colony losses can significantly influence genetic structure of a population and hence genetic diversity and genetic gain.
Function collapse() simulates the collapse of a colony (Figure 4). This function does not remove individuals from the colony, but sets the collapse status of the colony to TRUE and production status to FALSE. This is to mimic reality, where bees would still be present in the colony, although being dead. This allows us to extract any genetic material from the colony even after collapse, say to study genetic causes related to the collapse. Future operations in terms of reproduction or simulation of events are not allowed with a collapsed colony.
collapse() 0 1 nW nD 0 1 nW nD • Collapsing a MultiColony collapse() function is used when you want to keep collapsed colonies for subsequent analyses. If you don't need the collapsed colony, you can also simply select the surviving colonies with the selectColonies() function.