Introduction
Advanced population genetics began with the work of Wright, Fisher, and Haldane, and the fundamental genetic models were presented and investigated by Fisher, Wright, and Kimura [1-4]. Haldane merged Mendel’s theory of inheritance with Darwin’s theory of evolution to understand how mutation, selection, and random genetic drift affect evolutionary mechanisms, and to emphasize that natural selection can act within a Mendelian framework, so that Darwinism and Mendelism are consistent [5]. Wright [6,7] and Maruyama [5] studied the distribution of gene frequencies using different genetic models and proposed a model for studying structured populations. Wright [6-8] also investigated the effects of migration, mutation, and selection on changes in gene frequency in relation to population size. He explained that, without selection, migration, and mutation, the distribution of gene frequencies would remain unchanged. Kimura [9] analyzed the probability of fixation of genes.
Wright [6,8] derived the stationary distributions and studied the gene frequency distribution under mutation, whereas Kimura [9] and Maruyama [5] proposed a model for studying the gene frequency distribution without mutation. In a large population, the frequency of favored genes increases under selection until they are finally fixed. If the population is small, however, a gene can be fixed by chance [10]. When the heterozygote genotype has a selective advantage over the homozygotes, genetic variation is maintained in the population, which is very important for quantitative and eco-evolutionary genetics studies (Robertson, 1960) [10]. Heterozygote advantage has considerable effects on biodiversity conservation, and studying and formulating the diffusion of heterozygotes is important for deducing its effects [11].
The most important traits in medicine, agriculture, and biology have a complex dynamic genetic process. This process is under the control of many structural and regulatory genes and is also influenced by environmental factors [12]. In such traits, linkage disequilibrium and also systematic genetic factors such as mutation, migration, and multilevel selection affect the frequency of genes and therefore affect the results of genetic studies [1].
Additive and non-additive gene effects are the causes of genetic variability in quantitative traits. If the combined effect of the alleles equals the sum of their individual effects, the underlying gene effect is additive, and the value of the heterozygote is intermediate between those of the homozygotes, which is simply referred to as heterozygote intermediate. Non-additive effects, in contrast, arise from both dominance, i.e., the interaction between alleles within a locus, and epistasis, i.e., the interaction between alleles of different loci [13].
The narrow-sense heritability, a term commonly used to describe properties of quantitative traits, includes only the additive effects of variation and therefore represents the fixable part of the genetic variance that is transmitted to the next generation [13]. In genomic prediction studies, the additive effects of genes account for the mathematical modeling of infinitesimal genetic structures. In population and community genetics, a suitable equation is necessary for modeling the diffusion approximation of the gene frequency in a random environment.
Diffusion under additive effects has already been studied by Jensen [14], Nagylaki [15], and Bürger and Ewens [16]. Fisher [17,18] and Wright [8] pioneered the use of diffusion approximations for the study of gene frequencies, with emphasis on the flux of mutations. Wright [6,7] also used diffusion approximations to study the fixation of beneficial alleles. Diffusion approximations have also been widely used to study the random variation of selection coefficients [19]. The first approximation of the stochastic solutions could be the deterministic model of fixed-environment equations. The environment could be treated as a continuous or discrete variable. In modern population genetics, the geographical and spatial aspects of the diffusion of gene frequencies and their influence on evolutionary dynamics are considered and investigated [20,21].
A variable is called stochastic if its values change randomly over discrete or continuous time. If a process takes values in a continuous range, it is called a diffusion or continuous process [21]. The Wiener process, a real-valued continuous-time random process, is a fundamental tool for limit theorems. It was introduced as a natural mathematical model of Brownian motion. The mean square derivative of Brownian motion, or the Wiener process, is referred to as white noise, and if the probability distribution of the white noise is Gaussian, it is called Gaussian white noise [22]. Figure 1 shows a standard Wiener process created with Maple 18.01.
Figure 1: A standard Wiener process. Commands and parameters in Maple 18.01: W := WienerProcess(); P := PathPlot(W(t), t = 0 .. 10, timesteps = 40, replications = 30, thickness = 2, color = black, axes = BOXED, gridlines = true); P;
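For readers without access to Maple, the paths in Figure 1 can be approximated with a short NumPy sketch. It is only an illustration, not the code used for the figure: the function name simulate_wiener_paths and the seed are assumptions, while the time horizon, number of time steps, and number of replications mirror the Maple parameters in the caption.

```python
import numpy as np


def simulate_wiener_paths(t_end=10.0, timesteps=40, replications=30, seed=0):
    """Simulate standard Wiener paths on [0, t_end] (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    dt = t_end / timesteps
    # Independent Gaussian increments with mean 0 and variance dt.
    increments = rng.normal(0.0, np.sqrt(dt), size=(replications, timesteps))
    paths = np.zeros((replications, timesteps + 1))
    # Cumulative sums of the increments give W(t), with W(0) = 0.
    paths[:, 1:] = np.cumsum(increments, axis=1)
    times = np.linspace(0.0, t_end, timesteps + 1)
    return times, paths


times, paths = simulate_wiener_paths()
```

Each row of paths is one replication; plotting the rows against times gives a picture comparable to Figure 1, and the sample variance of the final column should be close to t_end when many replications are drawn.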
The basic model of Brownian motion is a one-dimensional random walk. A random walk is a process like genetic drift or Brownian motion, including a succession of changes in their directions and sizes governed by chance [23].
While Brownian motion is a continuous-time and continuous-space model, a random walk is a discrete-time and discrete-space one. Brownian motion gives rise to the particles being in constant motion causing more stability. If the future of a process depends only on its present state, then the process is called a Markovian process or Markov chain. An ordinary example of a Markov process is the inbreeding mating method [24]. Also, according to Mendelian inheritance, the change in the frequency of an allele from one generation to the next is Markovian [9]. If the expected value of the future of a process, given its history, is equal to its present value, the process is called a martingale. Brownian motion can be described as a martingale, a Markov process, or a normal process. A suitably rescaled random walk converges to Brownian motion; also, fractional Brownian motion has great applications in forecasting problems [25]. Also, Wang [26] studied fractional Brownian motion with random diffusivity and addressed the case of non-Gaussian anomalous diffusion in terms of a random-diffusivity mechanism in the presence of power-law correlated fractional Gaussian noise. Chertzvy et al. [27] found that for massive particles performing fractional Brownian motion, inertial effects not only destroyed the stylized fact of the equivalence of the ensemble-averaged mean-squared displacement to the time-averaged mean-squared displacement of overdamped or massless fractional Brownian motion but also dramatically altered the values of the ergodicity-breaking parameter.
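The Markov and martingale properties mentioned above can be illustrated with a neutral Wright-Fisher sampling step, in which the next allele frequency depends only on the current one and its expected value equals the current value. This is a minimal sketch under the assumption of pure binomial sampling of 2N genes with no selection or mutation; the function names are hypothetical.

```python
import numpy as np


def wright_fisher_step(v, n_individuals, rng):
    """One neutral generation: binomial sampling of 2N genes at frequency v."""
    n_genes = 2 * n_individuals
    return rng.binomial(n_genes, v) / n_genes


def simulate_drift(v0, n_individuals, generations, seed=0):
    """Follow the B1 frequency through successive neutral generations."""
    rng = np.random.default_rng(seed)
    path = [v0]
    for _ in range(generations):
        # Markov property: the next value depends only on the current one.
        path.append(wright_fisher_step(path[-1], n_individuals, rng))
    return np.array(path)


drift_path = simulate_drift(v0=0.3, n_individuals=50, generations=200)
```

Averaging many independent one-step outcomes started from the same frequency should return approximately that frequency, which is the martingale property in its simplest form.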
Birth-and-death processes, as well as the modelling of natural catastrophic events, have recently been revisited in terms of various resetting models, and within these restart-based models the processes of fractional and geometric Brownian motion (applicable to multiplicative growth in population dynamics) have been considered. Wang et al. [28] found, inter alia, that the resetting dynamics of originally ergodic fractional Brownian motion for superdiffusive Hurst exponents developed disparities in scaling and magnitude of the mean-squared displacements and mean time-averaged mean-squared displacements, indicating weak ergodicity breaking. On the other hand, Vinod et al. [29] derived the ensemble and time-averaged mean-squared displacements for Poisson-reset geometric Brownian motion.
The random fluctuations of environmental variables can change the frequency of genes and genotypes. These variables, such as temperature, humidity, pressure, light, elements of the air and soil, chemical materials, and other unknown factors, accompanied by the diffusion of the population in the habitat, produce random white noise. These random factors of the habitat may trigger the expression of duplicated genes and change the gene frequencies, so that these genes produce one protein or two very similar proteins, i.e., two identical phenotypes that are recognizable by electrophoresis methods [30], and therefore present white noise. In addition, neutral mutations influenced by the random factors of the environment exhibit similar phenotypes and produce white noise too.
Kimura [9,31] made the largest contribution to diffusion approximations in the study of gene frequencies and provided a solution to the stochastic processes in genetic models based on the Kolmogorov backward equation for the fixation of gene frequencies. The analysis of the Kolmogorov-Petrovskii-Piskunov (KPP) equation of Brownian motion, which can be applied to genetic models including genes with additive effects, was reported by Mueller and Sowers [32]. Beforehand, McKean [33] applied this equation in Brownian motion to analyze genetic models with additive gene effects. Benth, Deck, and Potthoff [34] also analyzed the Cauchy problems for some non-linear stochastic equations with white noise. Malliavin [35] proposed the stochastic analysis of Wiener functionals arising from the solutions of stochastic differential equations. A differential equation in which some of the terms are stochastic processes is called a stochastic differential equation; the solution of a stochastic differential equation is itself a stochastic process.
White [36] assessed the systems of interacting species living in a fluctuating random environment with white noise. Considering Gaussian diffusion in the Wright-Fisher genetic model with additive gene effects, Norman [37] investigated the random fluctuations in gene frequencies under mutation, selection, and random genetic drift. Moreover, the random genetic drift in a diffusion context has also been studied widely by Ethier and Nagylaki [38]. Illner and Wick [39] also studied statistics and measure-valued solutions for some genetic models with additive gene effects describing super-Brownian motion.
An attractive case is when the heterozygote genotypes present a selective advantage over other genotypes [10]. The heterozygote advantage has significant effects on biodiversity for the preservation of genetic variation in population and on plant and animal breeding programs in developing superior hybrid genotypes. Therefore, the mathematical modeling of this biological mechanism is important in eco-evolutionary dynamic studies and genetics investigations. The aims of the present study were: i) to study the diffusion approximation of the gene frequency based on the Haldane genetic model under the conditions of heterozygote intermediate or additive gene effects in a birth and death process in a random environment under a systematic process such as mutation and selection, ii) to evaluate the gene-environment interactions under Brownian motion model and iii) to include Brownian motion and connect it to the genetic model to mimic random drift.
Materials and methods
Haldane genetic model
To derive the Haldane genetic model, the Wright-Fisher genetic model was first considered under the assumptions of a monoecious diploid population, a diallelic locus, and non-overlapping generations [37]. I considered a single autosomal locus with alleles B1 and B2 and allele frequency v(t,x) under additive gene effects. The alleles were transferred independently from one generation to the next according to the Hardy-Weinberg law with a birth and death Markov process [32,37]. With two alleles, the population consisted of three genotypes: B1B1 (dominant homozygote), B1B2 (heterozygote), and B2B2 (recessive homozygote). In a diploid population with N individuals, there are 2N genes [40]. This genetic model can be generalized to a dioecious population; for more details, refer to Norman [37].
It was assumed that two genotypes were selected randomly and with replacement at each event. Since some mutations are lethal, the first individual dies and is replaced by another one whose type depends on that of the second selected individual. Thus, the Wright-Fisher genetic model, like the Moran genetic model [41], is a birth and death process and resembles the Bernoulli-Laplace and Ehrenfest models [36,42]. The Moran genetic model is a variant of the Wright-Fisher genetic model (Hofrichter, Jost, [1]), except that the Moran genetic model does not contain fitness [39,42]. As a result, the Haldane genetic model is the limit of the Wright-Fisher genetic model (see Lemma 1). Linkage was not considered unless complete crossing-over took place.
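The birth and death sampling event described above (two genes drawn at random with replacement, the first dying and being replaced by a copy of the second) can be sketched as a single update of the number of B1 genes. This is an illustrative neutral version without fitness or mutation; the function name moran_step and the variable names are assumptions.

```python
import random


def moran_step(k, n_genes, rng):
    """One birth-death event; k is the current number of B1 genes."""
    dies_b1 = rng.random() < k / n_genes    # type of the gene that dies
    parent_b1 = rng.random() < k / n_genes  # type of the gene that is copied
    if dies_b1 and not parent_b1:
        return k - 1  # a B1 gene is replaced by a B2 copy
    if parent_b1 and not dies_b1:
        return k + 1  # a B2 gene is replaced by a B1 copy
    return k          # same type dies and reproduces: no change


rng = random.Random(0)
counts = [10]
for _ in range(1000):
    counts.append(moran_step(counts[-1], 20, rng))
```

The count changes by at most one gene per event, and the states 0 and n_genes are absorbing, which is what makes this a birth and death process.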
Genetic variables
It was supposed that vn was the relative frequency of the B1 gene in the group of adults of generation n. Also, it was assumed that the random variable vn was measurable with respect to the time-dependent σ-field ƱnN, and the random variable v was measurable with respect to the time- and space-dependent σ-field Ʊn. Here, ƱnN was the σ-field generated by Un, Un-1, …, U0, where Un was the Markov process and {Un, n ≥ 0} was a martingale. Also, Ʊn =
where n = t was integer time, n ≥ 0, , A was a Borel-measurable subset of R, z(A) was the independent Brownian motion and
was the Borel σ-algebra [32,37].
I considered three genotypes, B1B1, B1B2, and B2B2, with relative fitnesses w1, w2, and w3, respectively. Under random mating, after one generation of selection, the relative frequencies of the genotypes are proportional to $w_1 v_n^2$, $2 w_2 v_n (1 - v_n)$, and $w_3 (1 - v_n)^2$, respectively (Norman, 1975). So, the expected B1 gene frequency after one generation of selection is
$$\frac{w_1 v_n^2 + w_2 v_n (1 - v_n)}{w_1 v_n^2 + 2 w_2 v_n (1 - v_n) + w_3 (1 - v_n)^2},$$
where the denominator is the average fitness [43]. If a B1 gene mutates to a B2 gene with rate γ1, and a B2 gene mutates to a B1 gene with rate γ2, then the expected B1 gene frequency in adult individuals is
[40]. The rate of forward mutation B1→B2 is sometimes greater than the rate of reverse mutation B2→B1 [30]. I considered stabilizing selection and heterozygote intermediate (no dominance) gene action. The coefficients of selection for the B1B1, B1B2, and B2B2 genotypes are α1 = 1 - w1, α2 = 1 - w2, and α3 = 1 - w3, respectively. Fitness is related only to the genotypes of individuals. It was assumed that θ = max(|α1|, |α2|, |α3|, γ1, γ2), with 0 < θ ≪ 1, was a small nonrandom quantity, since the αi and γi belong to the systematic genetic factors [43].
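The selection and mutation steps described above can be checked numerically with a short sketch. The selection step follows the genotype proportions stated in the text; the mutation step assumes the usual linear form, in which a fraction γ1 of B1 genes becomes B2 and a fraction γ2 of B2 genes becomes B1, applied after selection. That ordering and all function names are assumptions made for illustration.

```python
def frequency_after_selection(v, w1, w2, w3):
    """Expected B1 frequency after one generation of selection."""
    mean_fitness = w1 * v**2 + 2.0 * w2 * v * (1.0 - v) + w3 * (1.0 - v) ** 2
    # B1B1 contributes only B1 genes; B1B2 contributes half of its genes.
    return (w1 * v**2 + w2 * v * (1.0 - v)) / mean_fitness


def frequency_after_mutation(v, gamma1, gamma2):
    """Assumed linear mutation step: B1 -> B2 at gamma1, B2 -> B1 at gamma2."""
    return (1.0 - gamma1) * v + gamma2 * (1.0 - v)


def next_expected_frequency(v, w1, w2, w3, gamma1, gamma2):
    """Selection followed by mutation (assumed order)."""
    selected = frequency_after_selection(v, w1, w2, w3)
    return frequency_after_mutation(selected, gamma1, gamma2)


# Heterozygote intermediate: w2 lies midway between w1 and w3.
v_next = next_expected_frequency(0.5, 1.0, 0.9, 0.8, 1e-4, 1e-5)
```

With w1 = w2 = w3 = 1 and no mutation the function returns the starting frequency, which is a quick sanity check on the reconstruction.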
It is supposed that the birth rate (b) is influenced by fitness, the coefficient of selection, the size of the population, and mutation. Without considering fitness, Moran [41] and Dunham [42] suggested a formula to calculate the birth rate. According to Eq. (1), their formula is corrected as
where b is the birth rate of the population with the B1 allele. If the size of the population N is large, then the mutation rates are trivial. Thus
, , such that
.
Environment model development
For simplicity, I considered the environmental random variables responsible for the gene frequency fluctuation and producing random white noise. This random white noise affects the transmission of genes from parents to offspring in the population [36,37]. Therefore, Z = Z(t,x) is standard Brownian motion (the Wiener process) and
or
is the time-space derivatives of the Brownian sheet determined as a generalized random parameter in (ζ)* which is the space of Hida distributions [32,44,45]. For
, (ζ)* and (ζ) are the spaces of Hida and Kondratiev test functions, respectively [34]. In this paper, λi (t,x) is equal to the time and space white noise or polynomial noise, i.e. λi (t,x) =
.
Main Theorem (Theorem 1)
In the Wright-Fisher genetic model, let the random variable
be the frequency of the B1 gene. If the random habitat where alleles spread out changes quickly, then in genes with additive effects, the diffusion approximation
satisfies
where
considers white noise.
Lemma 1
In the Wright-Fisher genetic model, let
= vn be the frequency of the B1 genes in the group of adults at generation n. If N
N→∞ and N
→0, then →ƸnN where ƸnN = v(t) is the frequency of the B1 genes in the Haldane genetic model.
Further details can be obtained from Norman [37].
Modeling Haldane gene frequency via the diffusion equation
Let’s investigate the modeling of Haldane gene frequency ƸnN = v(t) based on the one-dimensional diffusion equation in a fixed and homogeneous environment.
Lemma 2
If v = v(t) is the frequency of the B1 gene in the Haldane genetic model and the population spreads out steadily in a stable environment, then the B1 gene frequency v = v(t,x) is the solution of $v_t = v_{xx} + f(v)$.
Further details can be obtained from Aronson & Weinberger [46].
Lemma 3
If f(v) is specified as , then, for genes with additive effects, $f(v) = v(1 - v) = v - v^2$.
Further details can be obtained from Aronson & Weinberger [46].
We have
, $f(0) = f(1) = 0$, $f'(0) > 0$, and $f(v) > 0$ for $0 < v < 1$. So, if diffusion occurs under additive gene effects, then for a fixed habitat the gene frequency in the Haldane genetic model is the solution of $v_t = v_{xx} + v - v^2$, which is the Fisher or KPP equation (see Lemmas 2 and 3). The complete form of the KPP equation is [32]
Lemma 4
If the random processes of the habitat change quickly and uniformly in time and one-dimensional space, then the sampling variations associated with the vector of variables that define the habitat will be very similar, and in practice identical, to time and space white noise.
Further details can be obtained from Benth, Deck & Potthoff [34]; Lee [45]; White [36], and Norman [37].
Here, the model of population genetics with white noise calculus is reformulated. Therefore:
Lemma 5
If the initial condition is a continuous function taking values in [0,1] such that, for a given constant c > 0,
(P1) it equals 1 for x < -c,
(P2) it equals 0 for x > c,
then there is an inevitable measurable solution v = v(t,x) in a σ-field Ʊn, 0 ≤ v(t,x) ≤ 1, that satisfies the following stochastic partial differential equation
Further details can be obtained from Shiga [47] and Mueller [48].
Therefore, Theorem 1 follows from Lemmas 1 to 5.
Software
Maple software ver.18.01 (Maplesoft, a division of Waterloo Maple Inc. 1981-2014), Wolfram Mathematica software ver.11.0.0.0 (Wolfram Research Inc. 1988-2016), and MATLAB software ver.R2017a 9.2.0.538062 (The MathWorks, Inc. 1984-2017) were used to solve the equations and develop the graphs. MathType software ver.7.4.2.480 (WIRIS America, Design Science, Inc. 1990-2019) was also used for typing the equations and formulae.
Results
Analysis of the mathematical model
I showed that the mathematical model in Theorem 1 belongs to a general class of models and explained how to analyze it. Modeling genetic phenomena by a Cauchy problem may not yield a global solution in the usual sense; thus, most of the time, the qualitative behavior of the solutions is considered. Some researchers use modern techniques to acquire statistical information about the behavior of genetic systems at a given time and place, since there is no satisfactory existence and uniqueness theory for the solutions of the Cauchy problem and stochastic partial differential equations [20,32,33].
General class of the mathematical model in theorem 1
The general class of the mathematical model in theorem 1 is explained as the next category of Cauchy problems
Where Q is a second-order differential operator on
, is a noise component, E and J are possibly nonlinear functions of the solution
and denotes the Wick product [34]. Eq. (4) appears in Mathematical Physics [49]. The solutions of Eq. (4) are generalized random variables in the spaces
relying on the time and space, hence
.
For and , the space defines as the projective limit of the Hilbert spaces . Also, for , calls the spaces of Kondratiev distributions. The Wick product of two elements defines [34]
Where in Eq. (5), is -transformation and is the smooth test function of time and space [34,38]. The generalizations of the Itô integral are defined as integrals of Wick products of random parameters and noise components. The regularity of test functions plays an important role in white noise analysis, so the Cauchy problems in Eq. (4) reduce to fixed-point problems in appropriately formed Banach spaces [44,45].
In Eq. (4), Q defines as
Where,
, and d ≥ 1. The functions ,
and are continuous on DT such that they satisfy a consistent Hölder condition in , uniformly in .
Now, consider the below stochastic Cauchy problem
Where, E and J are mappings from into for some , and , respectively (Benth, Deck, & Potthoff, 1997). In Eq. (7), λ(t,x) defines as a d-vector of time and space noise, so . Furthermore,
Where, in Eq. (9),
and λi are the time-space white noise. The noise component λ(t,x) in Eq. (7) can separate into two groups known as polynomial and non-polynomial noises [34].
Let’s show the solution of Eq. (7) as a fixed-point integral equation, i.e.
Where, , and g(t,x; s,y) is the fundamental solution of the heat equation. Refer to Benth, Deck, and Potthoff (1997) [34] for the precise conditions on , E, J, and .
The next equation is a special case of the Eq. (7) in one-dimension [48]:
The integral equation below is the solution of Eq. (11)
Where, is the basic solution of the heat equation on . The final integral in Eq. (12) defines through Walsh's [50] theory of integrals and martingale measures.
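For orientation, the following sketch shows how such an integral (mild) formulation is usually written for a one-dimensional equation of the form $v_t = v_{xx} + h(v) + a(v)\,\dot{Z}$. The heat kernel $G$ is the standard fundamental solution for $v_t = v_{xx}$; the noise coefficient $a(v)$ and the initial datum $v_0$ are placeholder symbols assumed here for illustration and are not notation recovered from Eq. (11) or Eq. (12).

```latex
G(t, x) = \frac{1}{\sqrt{4 \pi t}} \exp\!\left(-\frac{x^{2}}{4t}\right), \qquad t > 0,
\\[4pt]
v(t, x) = \int_{\mathbb{R}} G(t, x - y)\, v_0(y)\, dy
        + \int_0^t \!\! \int_{\mathbb{R}} G(t - s, x - y)\, h\bigl(v(s, y)\bigr)\, dy\, ds
        + \int_0^t \!\! \int_{\mathbb{R}} G(t - s, x - y)\, a\bigl(v(s, y)\bigr)\, Z(ds, dy).
```

The first two integrals are ordinary Lebesgue integrals; only the last one requires the martingale-measure construction referred to in the text.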
If , i.e., if there are no mutations and no environmental selections such that wi = 1, thus in lemma 3, and the Eq. (3) in lemma 5 reduce to . Here we have
Also, the solution of Eq. (3) in Lemma 5 based on the integral Eq. (12) is
Since the Eq. (3) in Lemma 5 is of the reaction-diffusion equations, in the Eq. (9) we have [34]; therefore, in the Eq. (3), . Now, the final integral on the right-hand side of Eq. (13) is denoted by [32], thus
Where .
In Eq. (3), is replaced by a fUnction to be easier to analyze. So, let satisfy
Note that may exceed 1, so is needed. An integral equation is derived in the same way as the integral Eq. (12) was derived
Where
There are a pair of solutions to Eqs. (3) and (16) such that, inevitably for all , and [32]. In population genetics, the study of the behavior of v(t,x) for large values of x reveals that the state space can be enlarged enough to include finite sets of paths defined up to time t. As a result, small sets of alleles are eliminated quickly. In other words, if v(t,x) is far from 1, then v(t,x) dx behaves like a super-Brownian motion and small sets of alleles are eliminated in finite time. Since some mutations are fatal, the frequency of alleles that have undergone fatal mutations decreases quickly for large x. Recall that super-Brownian motion has a density, satisfying in one dimension [22,33,49], that is similar to Eq. (3) when v is small.
Stationary situation
The stationary phase appears in the stochastic partial differential equations that are used to study population genetics and statistical mechanics. In population genetics, traveling wave solutions are used to study the propagation velocity of a perturbation in one-dimensional diffusion equations at equilibrium. Therefore, in Eq. (3)
, is a traveling wave solution [46]. Since v is a random variable with the Markov property, is a random traveling wave for Eq. (3).
The model shows that if v is close to 1, then the perturbation forms, i.e., v(t,x) = 1, are not stable. If the process starts by (condition P1 in Lemma 5), then some of the alleles will be transferred to the next generation, but others will change to mutant alleles and will therefore be lost. But if , then the frequency of the B1 allele in the intermediate region with finite length tends to the stationary situation and the blob will spread with a non-random limiting speed. This case is similar to the hair-trigger effect [46].
According to the conditions defined , the stationary phase occurs, but for example, if , then the stationary situation does not happen. Higgs [51] studied the multi-locus diploid genetics models with epistatic interactions in sexual, parthenogenetic, and selfing populations including the fitness landscape without diffusion and white noise. As a result, he showed that the stationary distributions occurred in his genetic model.
Shiga’s stepping stone model
Shiga (1988) introduced the stepping stone models in population genetics as a dual process for
This is a system of branching Brownian motion in which the particles coalesce at a Poisson rate based on the local time between pairs of particles. In this case, Poisson white noise can be created (Shiga, 1988). Eqs. (18) are analogous to our Eq. (3) in Lemma 5.
Haldane genetic model with no spatial spreading of the population
Suppose that the spatial spreading of the population in the Haldane genetic model is trivial, i.e., in Eq. (3). Therefore, in this system of interacting species in a random habitat, the stochastic process is governed by the stochastic differential equation as follows
where v = v(t) is measurable with respect to a σ-field ƱnN and satisfies the stochastic differential Eq. (19), in which v is the diffusion approximation of the frequency of the B1 allele and Zt is Gaussian white noise. Here L(v) is approximated by . Also, θ multiplied by the covariance of Gaussian white noise is called the effective noise strength. Eq. (19) is similar to the systems of interacting species in a stochastic habitat presented by White (1977). It is supposed that the rates of environmental change are quick. But if , then the effective noise strength is small in this model. In such a case, the results are supported by the assumption of tiny white noise. Therefore, Eq. (19) is a stochastic model for evaluating the frequency of the B1 allele under additive gene effects, or heterozygote intermediate, with white noise but without the spatial spreading of the population. This case was proposed by Falconer and MacKay (1996) but without white noise analysis.
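A stochastic differential equation of this kind, with a drift term and a noise term driven by a Wiener process, can be explored numerically with the Euler-Maruyama scheme. The sketch below is generic: the integrator accepts any drift and diffusion functions, and the logistic drift v(1 - v) and the noise amplitude sqrt(θ v(1 - v)) used in the example are illustrative assumptions, not the coefficients of Eq. (19). The function name and the clipping to [0, 1] are likewise choices made here.

```python
import numpy as np


def euler_maruyama(v0, drift, diffusion, t_end, n_steps, seed=0):
    """Integrate dv = drift(v) dt + diffusion(v) dZ on [0, t_end]."""
    rng = np.random.default_rng(seed)
    dt = t_end / n_steps
    v = np.empty(n_steps + 1)
    v[0] = v0
    for k in range(n_steps):
        dz = rng.normal(0.0, np.sqrt(dt))  # Wiener increment over one step
        step = drift(v[k]) * dt + diffusion(v[k]) * dz
        # Clip so that the simulated frequency stays in [0, 1].
        v[k + 1] = min(1.0, max(0.0, v[k] + step))
    return v


# Assumed, illustrative coefficients (not taken from Eq. (19)).
theta = 0.01
path = euler_maruyama(
    v0=0.1,
    drift=lambda v: v * (1.0 - v),
    diffusion=lambda v: np.sqrt(theta * v * (1.0 - v)),
    t_end=10.0,
    n_steps=10_000,
)
```

Setting the diffusion function to zero recovers the deterministic equation, which gives a simple way to compare a simulated path with the closed-form logistic curve for the assumed drift.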
Discussion
Comparisons of the equations
a) term in Eq. (11) is the particular case of in Eqs. (4) and (7), where Q is defined in Eq. (6).
b) h(v) term in Eq. (11) is the special case of in Eqs. (4) and (7).
c) term in Eq. (11) is the special case of , and
in Eqs. (4) and (7), respectively, where the complete equations are presented in Eqs. (8) and (9).
d) Based on a, b, and c above, Eq. (11) is the particular case and one-dimensional form of Eqs. (4) and (7).
e) Eq. (12) is the special case and one-dimensional form of Eq. (10).
f) The main equation of this paper, i.e., Eq. (3) in Lemma 5, is the special case of Eq. (11) where $h(v) = v - v^2$, and
.
Biological meaning and implications of the equations
I showed that if the environmental random processes in the Haldane genetic model changed quickly and smoothly, then in the case of additive gene effects, the diffusion approximation of the allele frequencies in the birth and death processes could be modeled and analyzed by a stochastic partial differential equation, i.e., Eq. (3) in Lemma 5, whose solution is presented in Eq. (13). Norman [37] derived a Gaussian diffusion process under selection, random genetic drift, and mutation with large N that fulfilled the Wright-Fisher genetic model and was identified with Haldane's gene frequencies. Almost simultaneously, McKean [33] had shown the application of Brownian motion to these equations and genetic models, including a proof of the theorem of Kolmogorov-Petrovskii-Piskunov. Also, Mueller and Sowers [32] studied stochastic partial differential equations and their applications to genetic models including genes with additive effects.
By replacing with a function that is simpler to analyze, Eq. (16) was derived. These implied that if v(t,x) behaved like a super-Brownian motion, and if fatal mutations took place, then for larger values of x, v was far away from 1, and a tiny group of alleles disappeared quickly. Also, if v(t,x) is close to 1, then v is not stable,
and the frequency of the B1 allele in the intermediate region tended to the stationary situation according to the stabilizing selection conditions
. Jensen [14] solved a partial differential equation for additive viabilities for heterozygote genotypes under the selection pressure. But in the current work, if , then the stationary situation did not take place. Therefore, the stationary state of the frequency of the B1 allele in the case of heterozygote intermediate was a result of the stabilizing selection. Nagylaki [15] studied the evolution of a monoecious, diploid, diallelic locus population under some conditions without dominance effects and obtained a diffusion problem for the gene frequency and its correlation with the environment. For a mutant alone, Bürger and Ewens [16] confirmed the diffusion estimate for the fixation probability. Also, exact equations were derived by Ethier and Nagylaki [38] for the stationary distributions with soft linkage.
I showed that with no mutations and no habitat selections, the equation is as follows
, then , and if wi = 1, then . Therefore
I solved the above system using Maple 18.01 with pdsolve and pdetest commands and I obtained , after applying the initial condition, , and some mathematical calculus. Therefore, the gene frequency distribution of the B1 allele was
Therefore, the 3D plot for the above gene frequency distribution was presented in Figure 2:
Figure 2: Plotting Eq. (20). Commands and parameters in Maple 18.01: plot3d(v, x=-10..10, t=0..10).
In Theorem 1, if , i.e., no white noise, then for the fixed habitat, the frequency of the gene in the Haldane genetic model will be estimated by solving $v_t = v_{xx} + f(v)$. Therefore, the complete form of the KPP or Fisher equation [32] was derived for genes with additive effects in which $f(v) = v(1 - v) = v - v^2$ (see Lemmas 2 and 3). Finally, I derived the following deterministic form
The above system was again solved by Maple 18.01 with pdsolve and pdetest commands and after applying the initial condition, , and mathematical calculus, the following equation was obtained
The 3D plot for the above gene frequency distribution of the B1 allele is presented in Figure 3:
Figure 3: Plotting Eq. (21). Commands and parameters in Maple 18.01: plot3d(v, x=-10..10, t=0..10).
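As a cross-check that does not rely on Maple's pdsolve, the deterministic equation $v_t = v_{xx} + v - v^2$ can be integrated with an explicit finite-difference scheme. This is a minimal sketch, not the procedure used to obtain Eq. (21): the grid, time step, zero-flux boundaries, and the ramp-shaped initial condition (equal to 1 for x < -c and 0 for x > c, in the spirit of conditions P1 and P2) are all assumptions, as are the function names.

```python
import numpy as np


def solve_fisher_kpp(x, v0, t_end, dt):
    """Explicit Euler / central-difference solver for v_t = v_xx + v - v**2."""
    dx = x[1] - x[0]
    r = dt / dx**2
    if r > 0.5:
        raise ValueError("explicit scheme needs dt <= dx**2 / 2")
    v = v0.copy()
    for _ in range(int(round(t_end / dt))):
        padded = np.pad(v, 1, mode="edge")  # zero-flux boundaries
        laplacian = padded[2:] - 2.0 * v + padded[:-2]
        v = v + r * laplacian + dt * v * (1.0 - v)
    return v


def front_position(x, v, level=0.5):
    """First grid point at which v has dropped below the given level."""
    return x[np.argmax(v < level)]


x = np.linspace(-20.0, 40.0, 301)
c = 1.0
# Continuous ramp: 1 for x < -c, 0 for x > c.
v0 = np.clip((c - x) / (2.0 * c), 0.0, 1.0)
v5 = solve_fisher_kpp(x, v0, t_end=5.0, dt=0.01)
v10 = solve_fisher_kpp(x, v0, t_end=10.0, dt=0.01)
```

Comparing front_position at the two times gives a rough estimate of how fast the region where B1 is common advances; for this equation the classical minimal traveling-wave speed is 2, which the numerical front approaches from below.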
Wright [6,8] considered a population under mutation and selection pressure conditions and derived a formula for an equilibrium distribution that emerged from the random fixation of the genes; and with no selection pressure, the average frequency of heterozygote genotypes was correlated to the size of the population.
Thus, the traits that tolerate the stabilizing selection presented a different structure, namely, there was no dominance effect. In this case, if the dominance effect existed, it might be ambidirectional, i.e., in different directions [40], although the epistatic effect did not mainly take place. The above system of partial differential equations belonged to a special case of the random Cauchy problem in Eq. (7), whose solution as a fixed-point integral equation was shown in Eq. (10). Beforehand, Maruyama [5] had obtained the backward and forward Kolmogorov equations to study the frequency of genes and showed that the stochastic conversion of gene frequency to a random walk did not depend on the geographical structure of the population. Dakua and Sahambi [52] presented a method using a heat equation with a variable threshold technique for seed selection in random walk-based image segmentation. On the other hand, Dakua and Sahambi [53] used a random walk approach for automatic left ventricular contour extraction from cardiac magnetic resonance images. To extract the blood pool boundary or endocardium, Dakua and Sahambi (2014) used a random walk model. Besides, Dakua [54] presented a semi-automatic algorithm that utilized the noise for enhancing the contrast of low-contrast input magnetic resonance images, followed by a new graph cut method to reconstruct the surface of the left ventricle. Also, a semi-automatic graph-based approach was used for image segmentation by Dakua [55]. Illner and Wick [39] studied the statistical and measure-valued solutions of differential equations with non-uniquely solvable Cauchy problems describing super-Brownian motion. They showed that the classical solution theory was a generalized statistical solution idea.
Various types of gene action result from diverse selective forces. Haldane [4], Fisher [17,18], Wright [8], and Kimura [9] stated that the success or failure of a mutant gene depends on both chance and selective forces. Therefore, dominance and epistatic effects emerged because of directional selection acting on a trait, whereas stabilizing selection strongly enforced additive variation in the heterozygote intermediate case. Schneider, Baptestini, and de Aguiar [11] studied dominance and codominance in diploid genomes and explained that their neutral speciation models estimated the same frequency distributions. The stepping stone model, as a system of branching Brownian motions with Poisson white noise, was defined in Eq. (18), which is a dual process and has some similarities with Theorem 1.
Eq. (19) (a special case of Eq. (3)) gives a model for studying the diffusion approximation of the frequency of the B1 gene in Falconer and MacKay's [43] equation in the heterozygote intermediate case with white noise. This case is the Haldane genetic model with no spatial spreading of the population, in which the effective noise strength is defined as in Malliavin [35]. If we rewrite Eq. (19) as
Then, according to the theory of one-dimensional Itô diffusion processes and the Feynman-Kac theorem, is called the drift function, which is deterministic, and is called the stochastic diffusion function [56]. Here, the drift and diffusion coefficients are nonlinear. Eq. (19) arises in biology, especially in population genetics (Ewens, 2012). If in Eq. (19), i.e., if there were no stochastic processes, then the deterministic form of Eq. (19) would be . It was therefore solved in Maple 18.01 with the dsolve command. After applying the initial condition, , and some calculus, we had
Here, the 2D plot for the B1 allele frequency distribution, i.e., Falconer and MacKay's [43] equation, was presented in Figure 4:
Figure 4: Plotting Eq. (23). Command and parameter in Maple 18.01: plot(v, t=0..10).
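The deterministic limit described above can be illustrated with a short Python sketch that integrates an ordinary differential equation dv/dt = f(v) numerically and compares the result with a closed form. The deterministic form of Eq. (19) is not reproduced in this text, so the drift f(v) = v(1 - v) used here is only a placeholder assumption, chosen because it vanishes at v = 0 and v = 1. The initial value v0 = 0.1 and all function names are also hypothetical. This is not the Maple dsolve computation behind Figure 4.

```python
import math


def drift(v):
    # ASSUMPTION: placeholder drift f(v) = v(1 - v); the paper's own
    # deterministic equation is not available here.
    return v * (1.0 - v)


def rk4_solve(f, v0, t_end, n_steps):
    """Integrate dv/dt = f(v) from t = 0 to t_end with classical RK4."""
    h = t_end / n_steps
    v = v0
    path = [(0.0, v)]
    for i in range(n_steps):
        k1 = f(v)
        k2 = f(v + 0.5 * h * k1)
        k3 = f(v + 0.5 * h * k2)
        k4 = f(v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        path.append(((i + 1) * h, v))
    return path


def logistic_exact(v0, t):
    """Closed-form solution of dv/dt = v(1 - v) with v(0) = v0."""
    e = math.exp(t)
    return v0 * e / (1.0 - v0 + v0 * e)


if __name__ == "__main__":
    # Hypothetical initial frequency v0 = 0.1 on the interval t = 0..10.
    path = rk4_solve(drift, 0.1, 10.0, 1000)
    for t, v in path[::200]:
        print(f"t = {t:5.2f}  numeric = {v:.6f}  exact = {logistic_exact(0.1, t):.6f}")
```

Under this assumed drift the allele frequency rises monotonically toward fixation. A different drift would give a different curve, so the sketch shows only the workflow, not the values in Figure 4.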
Understanding the qualitative behavior of solutions and obtaining quantitative solutions of stochastic partial differential equations are the most demanding aims of their mathematics. However, as explained before, it is usually the qualitative behavior of solutions of stochastic partial differential equations that has been considered. Eq. (19) (or its equivalent, Eq. (22)), as an Itô process, was solved, and sample paths of this stochastic differential equation were simulated in Wolfram Mathematica 11.0.0.0 using the Euler-Maruyama solver. As an example, the diffusion approximation of v(t) under two hypothetical stabilizing selection conditions on θ (θ = 0.25 and θ = 0.4), for the initial condition, , is shown in Figure 5.
Figure 5: Plotting the sample paths of Eq. (19) (or Eq. (22)), only for two hypothetical conditions on θ. Command and parameters in Wolfram Mathematica 11.0.0.0: proc = ItoProcess, t = 0 to 10, Minimum increment: 0.01, Resampling: Linear interpolation.
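For readers without Mathematica, the Euler-Maruyama scheme named above can be sketched in a few lines of Python. The coefficients of Eq. (19) are not reproduced in this text, so the drift v(1 - v) and the diffusion θ·sqrt(v(1 - v)) below are placeholder assumptions, as are the initial value v0 = 0.1, the clamping of v to [0, 1], and all function names. Only the time interval (0 to 10), the step size (0.01), and the two θ values are taken from the text. The sketch is not expected to reproduce Figure 5.

```python
import math
import random


def drift(v):
    # ASSUMPTION: placeholder drift, not the drift of Eq. (19).
    return v * (1.0 - v)


def diffusion(v, theta):
    # ASSUMPTION: placeholder diffusion scaled by theta, not that of Eq. (19).
    return theta * math.sqrt(max(v * (1.0 - v), 0.0))


def euler_maruyama(v0, theta, t_end=10.0, dt=0.01, seed=None):
    """Simulate one sample path of dv = drift dt + diffusion dW."""
    rng = random.Random(seed)
    n_steps = int(round(t_end / dt))
    v = v0
    path = [v]
    for _ in range(n_steps):
        # Wiener increment over one step: normal with mean 0, variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        v = v + drift(v) * dt + diffusion(v, theta) * dw
        # ASSUMPTION: clamp so the simulated frequency stays in [0, 1].
        v = min(max(v, 0.0), 1.0)
        path.append(v)
    return path


if __name__ == "__main__":
    for theta in (0.25, 0.4):
        path = euler_maruyama(0.1, theta, seed=0)
        print(f"theta = {theta}: v(10) = {path[-1]:.4f}")
```

Setting theta to 0 removes the noise term and reduces the scheme to the forward Euler method for the deterministic equation, which is a convenient sanity check.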
If θ → 0, i.e., if the rates of mutation and selection become very small, the model becomes more deterministic and predictable. On the other hand, if θ → 1, i.e., if the rates of mutation and selection become large, the model becomes more stochastic, and more fluctuations occur because of the strong effective noise strength. In this case, a stationary state is not reached. Dakua et al. [57] denoised image sequences modeled by Brownian motion of particles placed in a double-well potential system.
I developed a MATLAB R2017a 9.2.0.538062 program to numerically solve the equation of the main Theorem (Theorem 1) of this work. Therefore, we have
In Eq. (24), , the drift function is defined as , f(v) > 0, f(0) = f(1) = 0, f'(0) > 0, the diffusion function is defined as , and is considered a Wiener process [58].
Using the MATLAB software, Eq. (24) can be solved numerically for any values of t, x, and θ. As an example, if θ = 0.25, t = 1, and n = 1000, some outputs of Eq. (24) are 3.24932908120224e-06, 6.49545146689758e-06, 9.73516362620046e-06, 1.29652683543448e-05, and 1.61825779279393e-05, and the related sample paths are shown in Figure 6. Some special cases of Eq. (24) also have applications in super-Brownian motion studies [58].
Figure 6: Plotting the sample paths of Eq. (24). Commands and parameters in MATLAB R2017a 9.2.0.538062: only for θ = 0.25, t = 1 and resampling (n) = 1000.
Unsolved issues
More research and model simulations are needed to complete the subject discussed in the present study, including models with small population size, asexual reproduction, nonrandom mating, polyploid populations, migration, very low birth rates, and moderate and high mutation rates. Different degrees of dominance and epistatic effects have to be considered, along with polyallelic loci and polygenic inheritance. In addition, since the fitness of genotypes is not usually fixed and correlates with other variables, it is valuable to study the mathematical modeling of fitness in more detail, together with an evaluation of heterogeneous environments. This work studied the diffusion approximation of the gene frequency based on the Haldane genetic model under heterozygote intermediate conditions (additive gene effects) in a birth and death process in a random environment, so changing the conditions of this model leads to new situations that need to be studied in depth. This additional complexity is therefore probably the main potential limitation of the current research.
Perspectives of the higher-level research
Under natural conditions, selection increases the fitness of a species and acts to adapt it to new environments in the form of directional selection. In populations and communities with sexual reproduction and intergenomic epistasis, traits such as fertility and viability affect the fitness of animals and plants. Therefore, selection will be directional toward the maximum expression of the suitable genes.
It is important to consider the higher level of studies, i.e., the mathematical modeling of gene interactions among species in community genetics [59]. Thus, the classic genotype × environment interaction (G×E) model must be extended to assess higher-order interactions such as genotype × genotype × environment interaction (G×G×E) models, i.e., the interspecific interactions [60]. Since the interrelationships between specific genes and ecosystems are not clear, it is crucial to analyze the related problems in population and community genetics mathematically. It is also necessary to consider modern marker technologies for DNA-RNA sequencing, comparative genomics, and molecular eco-evolutionary genetics [61].
In the present study, I aimed at the mathematical modeling and analysis of the Haldane genetic model under Brownian motion using a stochastic differential equation. There are, however, rich and attractive problems in eco-evolutionary community genetics, such as investigating indirect genetic effects, in which the phenotype of one organism is part of the habitat of another, via systems of stochastic partial differential equations and white noise calculus [62-64]. This may result in the emergence of a fascinating interdisciplinary scientific branch.
Final suggestion
It is proposed that the researchers integrate the predictions of the mathematical modeling of natural populations with the results of experimental designs including practical and empirical laboratory and field studies to increase the accuracy of the models and their outcomes.