Project - Heredity

This project applies some probability theory to compute things related to heredity and genes.

joint probability

def joint_probability(people, one_gene, two_genes, have_trait):
    """
    Compute and return a joint probability.

    The probability returned should be the probability that
        * everyone in set `one_gene` has one copy of the gene, and
        * everyone in set `two_genes` has two copies of the gene, and
        * everyone not in `one_gene` or `two_genes` does not have the gene, and
        * everyone in set `have_trait` has the trait, and
        * everyone not in set `have_trait` does not have the trait.
    """
    total_prob = 1
    for person in people:  # iterate over everyone (the keys of `people`)
        if person in one_gene:  # work out how many copies this person has
            gene_count = 1
        elif person in two_genes:
            gene_count = 2
        else:
            gene_count = 0
        trait = person in have_trait  # whether this person shows the trait
        mo = people[person]["mother"]  # look up the parents via people[person]
        fa = people[person]["father"]
        if not mo and not fa:  # no parent info: use the unconditional gene probability
            prob = PROBS["gene"][gene_count]
        else:
            # helper: probability of receiving the gene from one parent
            mo_prob = inherit_prob(mo, one_gene, two_genes)
            fa_prob = inherit_prob(fa, one_gene, two_genes)

            if gene_count == 2:  # two copies: one from each parent
                prob = mo_prob * fa_prob
            elif gene_count == 1:  # one copy: from exactly one parent, either one
                prob = (1 - mo_prob) * fa_prob + mo_prob * (1 - fa_prob)
            else:  # no copies: neither parent passed the gene on
                prob = (1 - mo_prob) * (1 - fa_prob)

        # probability of showing (or not showing) the trait given the gene count
        trait_prob = PROBS["trait"][gene_count][trait]

        final_prob = trait_prob * prob  # gene probability times trait probability
        total_prob *= final_prob  # fold this person into the joint probability

    return total_prob


def inherit_prob(parent, one_gene, two_genes):
    """Return the probability that `parent` passes the gene on to a child."""
    if parent in one_gene:  # one copy: passed on half the time
        prob = 0.5
    elif parent in two_genes:  # two copies: passed on unless it mutates away
        prob = 1 - PROBS["mutation"]
    else:  # no copies: the child can only get the gene through mutation
        prob = PROBS["mutation"]
    return prob

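This one nearly killed me. It is really hard, or maybe I just haven't fully understood the material; I worked through it slowly with Copilot's help.

Things to watch out for:

  1. Be clear about what the helper function means: it computes the probability of receiving the gene from one parent, so no further case analysis on the child is needed inside it.
  2. `for person in people` iterates over the keys of the `people` dict, so `person` is a name rather than a dict; access that person's record with `people[person]`.
  3. What we compute is the probability that the whole family in `people` shows this particular combination of genes and traits, so the running total has to be multiplied by each person's probability once it is computed.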
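To make the three notes concrete, here is a small hand-worked sketch for a family of three. The `PROBS` values are an assumption: they are the numbers I recall from the CS50 AI distribution code, which this post does not reproduce. `pass_prob` is a hypothetical stand-in for `inherit_prob` that takes a copy count instead of a name. The scenario is Lily (0 copies, no trait), James (2 copies, trait) and their child Harry (1 copy, no trait).

```python
# Assumed values from the CS50 AI distribution code (not shown in this post).
PROBS = {
    "gene": {2: 0.01, 1: 0.03, 0: 0.96},
    "trait": {
        2: {True: 0.65, False: 0.35},
        1: {True: 0.56, False: 0.44},
        0: {True: 0.01, False: 0.99},
    },
    "mutation": 0.01,
}


def pass_prob(copies):
    """Hypothetical helper: chance a parent with `copies` copies passes the gene on."""
    return {0: PROBS["mutation"], 1: 0.5, 2: 1 - PROBS["mutation"]}[copies]


# Lily and James have no parents listed, so they use the unconditional table.
lily = PROBS["gene"][0] * PROBS["trait"][0][False]
james = PROBS["gene"][2] * PROBS["trait"][2][True]

# Harry has exactly one copy: it came from Lily and not James, or the reverse.
from_mo = pass_prob(0)
from_fa = pass_prob(2)
harry_gene = from_mo * (1 - from_fa) + (1 - from_mo) * from_fa
harry = harry_gene * PROBS["trait"][1][False]

# The joint probability is the product over the whole family.
joint = lily * james * harry
print(joint)  # roughly 0.00266
```

Under these assumed numbers Harry's gene factor is 0.01 * 0.01 + 0.99 * 0.99 = 0.9802, and the product over the three people is what `joint_probability` should return for the same inputs.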

update

def update(probabilities, one_gene, two_genes, have_trait, p):
    """
    Add to `probabilities` a new joint probability `p`.
    Each person should have their "gene" and "trait" distributions updated.
    Which value for each distribution is updated depends on whether
    the person is in `have_gene` and `have_trait`, respectively.
    """
    for person in probabilities:
        if person in one_gene:
            probabilities[person]["gene"][1] += p
        elif person in two_genes:
            probabilities[person]["gene"][2] += p
        else:
            probabilities[person]["gene"][0] += p
        if person in have_trait:
            probabilities[person]["trait"][True] += p
        else:
            probabilities[person]["trait"][False] += p

This one is much simpler: add the computed probability `p` to the matching entries of the `probabilities` dict, branching on gene count and trait.
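As a rough illustration of the accumulation, the sketch below uses a compact variant of `update` and calls it twice. The layout of `probabilities` is inferred from how the function indexes it, and the names and `p` values (0.25 and 0.5) are made up for the example, not real joint probabilities.

```python
def update(probabilities, one_gene, two_genes, have_trait, p):
    """Compact variant: add `p` to each person's matching gene and trait entry."""
    for person in probabilities:
        copies = 1 if person in one_gene else 2 if person in two_genes else 0
        probabilities[person]["gene"][copies] += p
        # `person in have_trait` is a bool, which is exactly the trait dict's key
        probabilities[person]["trait"][person in have_trait] += p


# Assumed layout: every entry starts at 0 before any scenario is added.
probabilities = {
    name: {"gene": {2: 0, 1: 0, 0: 0}, "trait": {True: 0, False: 0}}
    for name in ("Harry", "James")
}

# Two hypothetical scenarios with made-up joint probabilities.
update(probabilities, {"Harry"}, {"James"}, {"James"}, 0.25)
update(probabilities, set(), {"James"}, {"James"}, 0.5)

print(probabilities["Harry"]["gene"])   # {2: 0, 1: 0.25, 0: 0.5}
print(probabilities["James"]["trait"])  # {True: 0.75, False: 0}
```

Each call touches every person exactly once per distribution, so after all scenarios are added the entries hold unnormalized totals.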

normalize

def normalize(probabilities):
    """
    Update `probabilities` such that each probability distribution
    is normalized (i.e., sums to 1, with relative proportions the same).
    """
    for person in probabilities:
        zero = probabilities[person]["gene"][0]
        one = probabilities[person]["gene"][1]
        two = probabilities[person]["gene"][2]
        
        ratio = 1 / (zero + one + two)
        
        probabilities[person]["gene"][0] = zero * ratio
        probabilities[person]["gene"][1] = one * ratio
        probabilities[person]["gene"][2] = two * ratio
        
        true = probabilities[person]["trait"][True]
        false = probabilities[person]["trait"][False]
        
        ratio = 1 / (true + false)
        
        probabilities[person]["trait"][True] = true * ratio
        probabilities[person]["trait"][False] = false * ratio

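This is normalization: add up all the probabilities under one field, work out the factor needed to scale them so they sum to 1, then overwrite the original values.

I did this the long way; the summation can be done with `sum`.

That wraps up the Heredity project.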
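A shorter version along those lines might look like the sketch below. It assumes each distribution has a non-zero total (otherwise the division fails) and loops over both fields instead of spelling out every key.

```python
def normalize(probabilities):
    """Scale each person's "gene" and "trait" distributions so each sums to 1."""
    for person in probabilities:
        for field in ("gene", "trait"):
            dist = probabilities[person][field]
            total = sum(dist.values())  # assumed non-zero
            for key in dist:
                dist[key] /= total  # relative proportions are preserved


# Hypothetical unnormalized totals, chosen so the results are exact in binary.
probabilities = {
    "Harry": {"gene": {2: 1, 1: 3, 0: 4}, "trait": {True: 1, False: 3}},
}
normalize(probabilities)
print(probabilities["Harry"]["gene"])   # {2: 0.125, 1: 0.375, 0: 0.5}
print(probabilities["Harry"]["trait"])  # {True: 0.25, False: 0.75}
```

Reassigning values while iterating over the keys is safe here because the dict's size never changes.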
../../../source/Pasted image 20250415180556.png

Licensed under CC BY-NC-SA 4.0
Last updated on May 23, 2025 02:30 UTC