Here I use some probability theory to compute things related to heredity and genes.
joint probability
|
def joint_probability(people, one_gene, two_genes, have_trait):
    """
    Compute and return a joint probability.

    The probability returned should be the probability that
        * everyone in set `one_gene` has one copy of the gene, and
        * everyone in set `two_genes` has two copies of the gene, and
        * everyone not in `one_gene` or `two_genes` does not have the gene, and
        * everyone in set `have_trait` has the trait, and
        * everyone not in set `have_trait` does not have the trait.
    """
    total_prob = 1
    for person in people:  # iterate over every person
        # Work out how many copies of the gene this person has
        if person in one_gene:
            gene_count = 1
        elif person in two_genes:
            gene_count = 2
        else:
            gene_count = 0
        trait = person in have_trait  # whether this person shows the trait
        # Look up the parents; note that the access goes through people[person]
        mo = people[person]["mother"]
        fa = people[person]["father"]
        if not mo and not fa:
            # No parent information: use the unconditional gene probability
            prob = PROBS["gene"][gene_count]
        else:
            # inherit_prob is a helper I wrote: the probability of
            # inheriting the gene from a single parent
            mo_prob = inherit_prob(mo, one_gene, two_genes)
            fa_prob = inherit_prob(fa, one_gene, two_genes)
            if gene_count == 2:
                # Two copies: one from each parent
                prob = mo_prob * fa_prob
            elif gene_count == 1:
                # One copy: from the father but not the mother, or the reverse
                prob = (1 - mo_prob) * fa_prob + mo_prob * (1 - fa_prob)
            else:
                # No copies: neither parent passed the gene on
                prob = (1 - mo_prob) * (1 - fa_prob)
        # Probability of this trait value given the gene count
        # (roughly, the "expression probability")
        trait_prob = PROBS["trait"][gene_count][trait]
        final_prob = trait_prob * prob  # combine with the gene probability
        total_prob *= final_prob  # fold into the running total
    return total_prob


def inherit_prob(parent, one_gene, two_genes):
    """Return the probability that `parent` passes the gene to a child."""
    if parent in one_gene:
        # One copy: a 50% chance of passing it on
        prob = 0.5
    elif parent in two_genes:
        # Two copies: passed on unless it mutates away
        prob = 1 - PROBS["mutation"]
    else:
        # No copies: the child can only get the gene through mutation
        prob = PROBS["mutation"]
    return prob
|
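This one nearly finished me off. It really is hard, or maybe I just had not fully understood the material. I chewed through it slowly with Copilot's help.
Things to note:
- Be clear about what the helper function means: it computes the probability of getting the gene from one parent, so there is no need for another case analysis on the child inside it.
- `for person in people` iterates over the keys of the people dictionary, so person cannot be used as a dictionary itself; use people[person] to get at its data.
- What is being computed is the probability that the whole family in people shows this particular combination of genes and traits, so the running total has to be updated after each person is processed.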
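To make these notes concrete, here is a small hand-checkable sketch of the per-person product for a three-person family. The `PROBS` values and the family below are assumptions chosen for illustration (they are not given in this post), and `pass_prob`, `gene_count`, and `has_trait` are hypothetical stand-ins for `inherit_prob` and the three sets passed to `joint_probability`.

```python
# Assumed values for illustration only; the real PROBS lives in the project file.
PROBS = {
    "gene": {2: 0.01, 1: 0.03, 0: 0.96},
    "trait": {
        2: {True: 0.65, False: 0.35},
        1: {True: 0.56, False: 0.44},
        0: {True: 0.01, False: 0.99},
    },
    "mutation": 0.01,
}

# Hypothetical family: keys are names, values are per-person dicts.
people = {
    "Harry": {"mother": "Lily", "father": "James"},
    "James": {"mother": None, "father": None},
    "Lily": {"mother": None, "father": None},
}
gene_count = {"Harry": 1, "James": 2, "Lily": 0}
has_trait = {"Harry": False, "James": True, "Lily": False}


def pass_prob(parent):
    """Chance that `parent` passes the gene on (stand-in for inherit_prob)."""
    m = PROBS["mutation"]
    return {0: m, 1: 0.5, 2: 1 - m}[gene_count[parent]]


total = 1.0
factors = {}
for person in people:  # `person` is a key (a name), not a dict
    info = people[person]
    n = gene_count[person]
    if info["mother"] is None and info["father"] is None:
        # No parents listed: unconditional gene probability
        gene_p = PROBS["gene"][n]
    else:
        mo = pass_prob(info["mother"])
        fa = pass_prob(info["father"])
        gene_p = {
            0: (1 - mo) * (1 - fa),
            1: mo * (1 - fa) + (1 - mo) * fa,
            2: mo * fa,
        }[n]
    factors[person] = gene_p * PROBS["trait"][n][has_trait[person]]
    total *= factors[person]  # one factor per person, multiplied together

print(factors)
print(total)
```

With these assumed numbers the factors come out to about 0.4313 for Harry, 0.0065 for James, and 0.9504 for Lily, so the family's joint probability is about 0.00266.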
update
|
def update(probabilities, one_gene, two_genes, have_trait, p):
    """
    Add to `probabilities` a new joint probability `p`.
    Each person should have their "gene" and "trait" distributions updated.
    Which value for each distribution is updated depends on whether
    the person is in `one_gene` or `two_genes`, and in `have_trait`.
    """
    for person in probabilities:
        # Add p to the bucket for this person's gene count
        if person in one_gene:
            probabilities[person]["gene"][1] += p
        elif person in two_genes:
            probabilities[person]["gene"][2] += p
        else:
            probabilities[person]["gene"][0] += p
        # Add p to the bucket for this person's trait value
        if person in have_trait:
            probabilities[person]["trait"][True] += p
        else:
            probabilities[person]["trait"][False] += p
|
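This one is much simpler: add the computed probability p to the matching entries of the probabilities dictionary, using a case analysis to pick the right entry.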
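A minimal sketch of what that accumulation looks like over two scenarios. The names, the starting dictionary, and the values 0.25 and 0.5 are made up for illustration, and the function is a compact restatement of the update step rather than the exact code above.

```python
def update(probabilities, one_gene, two_genes, have_trait, p):
    """Compact restatement of the update step, for illustration."""
    for person in probabilities:
        genes = 1 if person in one_gene else 2 if person in two_genes else 0
        probabilities[person]["gene"][genes] += p
        probabilities[person]["trait"][person in have_trait] += p


# Hypothetical starting state: every bucket at zero.
probabilities = {
    name: {"gene": {2: 0, 1: 0, 0: 0}, "trait": {True: 0, False: 0}}
    for name in ("Harry", "James")
}

# Two made-up scenarios with made-up joint probabilities.
update(probabilities, {"Harry"}, {"James"}, {"James"}, 0.25)
update(probabilities, set(), {"James"}, {"Harry", "James"}, 0.5)

print(probabilities["Harry"])
print(probabilities["James"])
```

Each call touches exactly one gene bucket and one trait bucket per person, so after both calls James has 0.75 under two genes and under `True`, while Harry's mass is split between the one-gene and zero-gene buckets.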
normalize
|
def normalize(probabilities):
    """
    Update `probabilities` such that each probability distribution
    is normalized (i.e., sums to 1, with relative proportions the same).
    """
    for person in probabilities:
        # Rescale the gene distribution
        zero = probabilities[person]["gene"][0]
        one = probabilities[person]["gene"][1]
        two = probabilities[person]["gene"][2]
        ratio = 1 / (zero + one + two)
        probabilities[person]["gene"][0] = zero * ratio
        probabilities[person]["gene"][1] = one * ratio
        probabilities[person]["gene"][2] = two * ratio
        # Rescale the trait distribution
        true = probabilities[person]["trait"][True]
        false = probabilities[person]["trait"][False]
        ratio = 1 / (true + false)
        probabilities[person]["trait"][True] = true * ratio
        probabilities[person]["trait"][False] = false * ratio
|
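Normalization here means adding up all the probabilities under one field, working out the factor needed to scale that sum to 1, and then rescaling the existing values by that factor.
I made this more laborious than it needed to be; the totals can be computed with sum.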
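A sketch of the shorter version using `sum`, assuming the same dictionary layout as above and that every distribution has a nonzero total (the original code makes the same assumption). The sample numbers are made up.

```python
def normalize(probabilities):
    """Rescale each distribution in place so its values sum to 1."""
    for person in probabilities:
        for field in ("gene", "trait"):
            dist = probabilities[person][field]
            total = sum(dist.values())  # assumed nonzero
            for key in dist:
                dist[key] /= total


# Hypothetical unnormalized values for one person.
probabilities = {
    "Harry": {
        "gene": {2: 0.1, 1: 0.3, 0: 0.1},
        "trait": {True: 0.2, False: 0.6},
    }
}
normalize(probabilities)
print(probabilities["Harry"])
```

Looping over the field names avoids writing the gene and trait cases separately, and dividing by the total keeps the relative proportions unchanged.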
With that, the Heredity project is done.
