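Evaluation Metrics

Evaluation metrics module.

Provides evaluation metric functions for influence maximization algorithms.

The module contains:

ranking_metrics: ranking stability metrics
influence_metrics: influence spread metrics
seed_quality_metrics: seed node quality metrics
network_metrics: network structure metrics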

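pynetim.evaluation.average_shortest_distance(graph, seeds, use_weight=False)

Computes the average shortest distance between seed nodes.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.
use_weight (bool) – Whether to use edge weights when computing distances. Defaults to False (hop count).

Returns:

The average shortest distance. Returns -1 if the seed nodes cannot reach one another.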

Return type:

float

Example

>>> from pynetim.evaluation import average_shortest_distance
>>> # Based on hop count
>>> avg_dist = average_shortest_distance(graph, seeds)
>>> # Based on edge weights
>>> avg_dist = average_shortest_distance(graph, seeds, use_weight=True)
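
To make the hop-count variant concrete, here is a minimal sketch that does not use pynetim at all. It assumes a plain adjacency dict in place of IMGraph, and the helper names (bfs_hops, avg_seed_distance) are hypothetical. It also assumes that a single unreachable seed pair is enough to return -1 and that fewer than two seeds yields 0.0; the library may handle these cases differently.

```python
from collections import deque
from itertools import combinations


def bfs_hops(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def avg_seed_distance(adj, seeds):
    """Mean hop distance over all unordered seed pairs.

    Assumption: returns -1.0 as soon as one pair is unreachable,
    and 0.0 when there are fewer than two seeds.
    """
    pairs = list(combinations(sorted(seeds), 2))
    if not pairs:
        return 0.0
    total = 0
    for u, v in pairs:
        d = bfs_hops(adj, u).get(v)
        if d is None:
            return -1.0
        total += d
    return total / len(pairs)


# Path graph 0 - 1 - 2 - 3, stored as an undirected adjacency dict.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(avg_seed_distance(adj, {0, 3}))     # 3.0
print(avg_seed_distance(adj, {0, 2, 3}))  # (2 + 3 + 1) / 3 = 2.0
```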

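pynetim.evaluation.clustering_coefficient(graph, seeds)

Computes the average clustering coefficient of the seed nodes.

The clustering coefficient measures how densely a node's neighbors are connected to each other.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

The average clustering coefficient, in the range [0, 1].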

Return type:

float

Example

>>> from pynetim.evaluation import clustering_coefficient
>>> cc = clustering_coefficient(graph, seeds)
>>> print(f"Clustering coefficient: {cc:.4f}")
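
As an illustration of what "connection density among a node's neighbors" means, the sketch below computes the standard local clustering coefficient for an undirected graph and averages it over a seed set. The adjacency dict of sets and the names local_cc and mean_seed_cc are assumptions made for this example; how pynetim treats directed graphs or nodes with fewer than two neighbors is not stated in this page.

```python
from itertools import combinations


def local_cc(adj, node):
    """Fraction of neighbour pairs of `node` that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumption: no neighbour pairs means coefficient 0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def mean_seed_cc(adj, seeds):
    """Average of the local coefficients over the seed set."""
    return sum(local_cc(adj, s) for s in seeds) / len(seeds)


# Triangle 0-1-2 plus a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(local_cc(adj, 0))           # 1.0
print(mean_seed_cc(adj, {0, 3}))  # 0.5
```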

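pynetim.evaluation.degree_distribution(graph, seeds)

Computes the degree distribution of the seed nodes.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

A mapping from degree value to the number of nodes with that degree.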

Return type:

Dict[int, int]

Example

>>> from pynetim.evaluation import degree_distribution
>>> dist = degree_distribution(graph, seeds)
>>> print(f"Degree 5: {dist.get(5, 0)} nodes")
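
The shape of the returned mapping can be illustrated with a small sketch. It assumes an undirected adjacency dict rather than an IMGraph, and seed_degree_distribution is a hypothetical name, not part of pynetim.

```python
from collections import Counter


def seed_degree_distribution(adj, seeds):
    """Map each degree value to the number of seed nodes having that degree."""
    return dict(Counter(len(adj[s]) for s in seeds))


# Star graph: node 0 is connected to nodes 1..4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
dist = seed_degree_distribution(adj, {0, 1, 2})
print(dist.get(4, 0))  # 1 (the hub)
print(dist.get(1, 0))  # 2 (the two leaf seeds)
```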

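pynetim.evaluation.degree_statistics(graph, seeds)

Computes degree statistics for the seed nodes.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

A dictionary with the following statistics:

mean_degree: mean degree
max_degree: maximum degree
min_degree: minimum degree
std_degree: standard deviation of the degree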

Return type:

Dict[str, float]

Example

>>> from pynetim.evaluation import degree_statistics
>>> stats = degree_statistics(graph, seeds)
>>> print(f"Mean degree: {stats['mean_degree']:.2f}")
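
A minimal sketch of how such a dictionary could be assembled with the standard library follows. The adjacency dict and the name seed_degree_stats are assumptions, and the use of the population standard deviation is a guess; this page does not say whether the library uses the population or the sample formula.

```python
import statistics


def seed_degree_stats(adj, seeds):
    """Degree statistics over the seed set, keyed like the documented result."""
    degrees = [len(adj[s]) for s in seeds]
    return {
        "mean_degree": statistics.fmean(degrees),
        "max_degree": float(max(degrees)),
        "min_degree": float(min(degrees)),
        # Assumption: population standard deviation.
        "std_degree": statistics.pstdev(degrees),
    }


# Star graph: node 0 is connected to nodes 1..4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
stats = seed_degree_stats(adj, {0, 1, 2})
print(f"Mean degree: {stats['mean_degree']:.2f}")  # Mean degree: 2.00
```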

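pynetim.evaluation.distribution_entropy(graph, seeds)

Computes the entropy of the seed nodes over the degree distribution.

A higher entropy means the seed nodes are spread more evenly across degree values.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

The distribution entropy, in the range [0, 1].

1.0 means the distribution is perfectly uniform
0.0 means the distribution is fully concentrated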

Return type:

float

Example

>>> from pynetim.evaluation import distribution_entropy
>>> entropy = distribution_entropy(graph, seeds)
>>> print(f"Distribution entropy: {entropy:.4f}")
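
One common way to obtain an entropy bounded by [0, 1] is to divide the Shannon entropy by its maximum. The sketch below does this over the degree values of the seeds, normalizing by the logarithm of the number of distinct degrees. Both the normalization and the name normalized_entropy are assumptions; this page does not specify how pynetim normalizes the value.

```python
import math
from collections import Counter


def normalized_entropy(labels):
    """Shannon entropy of the label frequencies divided by log(#distinct labels)."""
    counts = Counter(labels)
    if len(counts) < 2:
        return 0.0  # a single class is fully concentrated
    n = sum(counts.values())
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(len(counts))


# Degrees of four hypothetical seeds: two degree classes, evenly populated.
print(round(normalized_entropy([3, 3, 5, 5]), 4))  # 1.0
# All seeds share one degree value.
print(normalized_entropy([4, 4, 4, 4]))            # 0.0
```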

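pynetim.evaluation.kendall_tau(ranking1, ranking2, variant='b')

Computes the Kendall rank correlation coefficient between two rankings.

Kendall's Tau measures the correlation between two ranking sequences and is suited to evaluating how consistently different algorithms rank seed nodes.

Parameters:

ranking1 (List[int] | ndarray) – First ranking sequence (list of node IDs).
ranking2 (List[int] | ndarray) – Second ranking sequence (list of node IDs).
variant (str) – Kendall's Tau variant, either 'b' or 'c'. Defaults to 'b', which handles tied ranks.

Returns:

(tau coefficient, p-value)

The tau coefficient lies in [-1, 1]; 1 means the rankings agree completely, -1 means they are exactly reversed
The p-value indicates statistical significance

Return type:

Tuple[float, float]

Raises:

ImportError – If scipy is not installed.
ValueError – If the inputs differ in length.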

References

Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30(1/2), 81-93.

Example

>>> from pynetim.evaluation import kendall_tau
>>> seeds_algo1 = [0, 1, 2, 3, 4]
>>> seeds_algo2 = [0, 2, 1, 3, 4]
>>> tau, p = kendall_tau(seeds_algo1, seeds_algo2)
>>> print(f"Kendall's Tau: {tau:.4f}, p-value: {p:.4f}")
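
For intuition about the coefficient itself, the following pure-Python sketch counts concordant and discordant pairs directly. It computes tau-a, which coincides with tau-b only when neither sequence contains ties, and it does not compute a p-value. The name kendall_tau_a is hypothetical and this is not the library's implementation.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Tau-a: (concordant - discordant) / number of pairs. Assumes no ties."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Same inputs as the example above: one swapped pair out of ten.
print(kendall_tau_a([0, 1, 2, 3, 4], [0, 2, 1, 3, 4]))  # 0.8
```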

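pynetim.evaluation.local_clustering(graph, seeds)

Computes the local clustering coefficient of the seed nodes.

Evaluates how tightly knit the network is around the seed nodes.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

The average local clustering coefficient, in the range [0, 1].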

Return type:

float

Example

>>> from pynetim.evaluation import local_clustering
>>> clustering = local_clustering(graph, seeds)
>>> print(f"Local clustering: {clustering:.4f}")

pynetim.evaluation.mean_centrality(graph, seeds, centrality_type='degree')

Computes the mean centrality of the seed nodes.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.
centrality_type (str) – Centrality type. One of:
    'degree': degree centrality
    'in_degree': in-degree centrality
    'out_degree': out-degree centrality

Returns:

The mean centrality value.

Return type:

float

Example

>>> from pynetim.evaluation import mean_centrality
>>> centrality = mean_centrality(graph, seeds, centrality_type='degree')

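pynetim.evaluation.monotonicity_score(values)

Computes the monotonicity score of a node-importance ranking.

The monotonicity metric measures how well a ranking algorithm distinguishes the importance of different nodes. If the algorithm assigns every node a unique importance value, the monotonicity approaches 1; if many nodes share the same importance value, the monotonicity drops.

Formula:

M = (N_unique - 1) / (N_total - 1)

where:

N_unique: number of distinct importance values
N_total: total number of nodes
M takes values in [0, 1]
M → 1 means the ranking algorithm discriminates well between nodes
M → 0 means all nodes share the same importance value

Parameters:

values (List[float] | ndarray) – Sequence of node importance scores.

Returns:

The monotonicity score, in the range [0, 1].

1.0 means every node has a unique importance value (perfect discrimination)
0.0 means all nodes have the same importance value (no discrimination)

Return type:

float

References

A novel voting measure for identifying influential nodes in complex networks based on local structure. Scientific Reports, 2025.
Monotonicity evaluation of node-importance ranking algorithms in complex networks.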

Example

>>> from pynetim.evaluation import monotonicity_score
>>> # All values are distinct, so monotonicity is high
>>> values1 = [1.0, 2.0, 3.0, 4.0, 5.0]
>>> score1 = monotonicity_score(values1)
>>> print(f"Monotonicity: {score1:.4f}")  # 1.0

>>> # Duplicate values lower the monotonicity
>>> values2 = [1.0, 2.0, 2.0, 4.0, 5.0]
>>> score2 = monotonicity_score(values2)
>>> print(f"Monotonicity: {score2:.4f}")  # 0.75

>>> # All values are identical, so monotonicity is 0
>>> values3 = [3.0, 3.0, 3.0, 3.0, 3.0]
>>> score3 = monotonicity_score(values3)
>>> print(f"Monotonicity: {score3:.4f}")  # 0.0
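
The formula M = (N_unique - 1) / (N_total - 1) can be reproduced in a few lines. This is a sketch of the stated formula only, with a hypothetical function name; the guard for fewer than two values is an assumption, since the formula is undefined there and this page does not say what the library returns.

```python
def monotonicity(values):
    """M = (N_unique - 1) / (N_total - 1)."""
    n_total = len(values)
    if n_total < 2:
        return 0.0  # assumption: the formula is undefined for fewer than two values
    return (len(set(values)) - 1) / (n_total - 1)


print(monotonicity([1.0, 2.0, 3.0, 4.0, 5.0]))  # 1.0
print(monotonicity([1.0, 2.0, 2.0, 4.0, 5.0]))  # (4 - 1) / (5 - 1) = 0.75
print(monotonicity([3.0, 3.0, 3.0, 3.0, 3.0]))  # 0.0
```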

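pynetim.evaluation.neighbor_coverage(graph, seeds)

Computes the neighbor coverage of the seed nodes.

Neighbor coverage = number of unique neighbors of the seed nodes / total number of nodes in the network

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

The neighbor coverage, in the range [0, 1].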

Return type:

float

Example

>>> from pynetim.evaluation import neighbor_coverage
>>> coverage = neighbor_coverage(graph, seeds)
>>> print(f"Neighbor coverage: {coverage:.2%}")
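
The ratio above can be sketched on a plain adjacency dict as follows. The name coverage is hypothetical, and one detail is an assumption: a seed that is also a neighbor of another seed is counted as a neighbor here, which the library may or may not do.

```python
def coverage(adj, seeds):
    """Unique neighbours of the seed set divided by the total node count.

    Assumption: seeds that neighbour other seeds are not excluded.
    """
    neighbours = set()
    for s in seeds:
        neighbours.update(adj[s])
    return len(neighbours) / len(adj)


# Path graph 0 - 1 - 2 - 3 - 4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(f"Neighbor coverage: {coverage(adj, {1}):.2%}")     # Neighbor coverage: 40.00%
print(f"Neighbor coverage: {coverage(adj, {1, 2}):.2%}")  # Neighbor coverage: 80.00%
```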

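pynetim.evaluation.ranking_distance(ranking1, ranking2, metric='kendall')

Computes the distance between two rankings.

Parameters:

ranking1 (List[int] | ndarray) – First ranking sequence.
ranking2 (List[int] | ndarray) – Second ranking sequence.
metric (str) – Distance metric. One of:
    'kendall': Kendall distance (number of swaps)
    'spearman': Spearman distance (sum of squared rank differences)
    'hamming': Hamming distance (number of positions that differ)

Returns:

The ranking distance; smaller values mean the rankings are more similar.

Return type:

float

Raises:

ValueError – If metric is not supported.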

Example

>>> from pynetim.evaluation import ranking_distance
>>> ranking1 = [0, 1, 2, 3, 4]
>>> ranking2 = [0, 2, 1, 3, 4]
>>> distance = ranking_distance(ranking1, ranking2, metric='kendall')
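
The three metrics can be written out on plain lists to show how they differ. The function names below are hypothetical and the definitions are one plausible reading of the short descriptions in the parameter list; the sketch assumes both rankings contain exactly the same items, each once.

```python
from itertools import combinations


def kendall_distance(r1, r2):
    """Number of item pairs that the two rankings order differently."""
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2) if pos2[a] > pos2[b])


def spearman_distance(r1, r2):
    """Sum over items of the squared difference between their two positions."""
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum((i - pos2[item]) ** 2 for i, item in enumerate(r1))


def hamming_distance(r1, r2):
    """Number of positions holding different items."""
    return sum(1 for a, b in zip(r1, r2) if a != b)


ranking1 = [0, 1, 2, 3, 4]
ranking2 = [0, 2, 1, 3, 4]
print(kendall_distance(ranking1, ranking2))   # 1
print(spearman_distance(ranking1, ranking2))  # 2
print(hamming_distance(ranking1, ranking2))   # 2
```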

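pynetim.evaluation.ranking_stability(rankings, method='kendall')

Computes the average stability across multiple rankings.

Evaluates how consistent the rankings produced by repeated runs of an algorithm are.

Parameters:

rankings (List[List[int] | ndarray]) – List of ranking sequences.
method (str) – Stability method. One of:
    'kendall': use the Kendall coefficient
    'spearman': use the Spearman coefficient
    'overlap': use the Top-K overlap rate

Returns:

The average stability score, in the range [0, 1].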

Return type:

float

Example

>>> from pynetim.evaluation import ranking_stability
>>> rankings = [
...     [0, 1, 2, 3, 4],
...     [0, 2, 1, 3, 4],
...     [0, 1, 2, 4, 3]
... ]
>>> stability = ranking_stability(rankings)

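pynetim.evaluation.reachability(graph, seeds)

Computes the reachability of the seed nodes.

Reachability = number of nodes reachable from the seed nodes / total number of nodes in the network

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

The reachability, in the range [0, 1].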

Return type:

float

Example

>>> from pynetim.evaluation import reachability
>>> reach = reachability(graph, seeds)
>>> print(f"Reachability: {reach:.2%}")
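
A multi-source breadth-first search is one straightforward way to compute this ratio. The sketch below uses a directed adjacency dict in place of IMGraph and a hypothetical function name, and it assumes the seeds themselves count as reachable; the library may define the numerator differently.

```python
from collections import deque


def reachable_fraction(adj, seeds):
    """Fraction of nodes reachable from any seed (seeds count as reachable)."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return len(seen) / len(adj)


# Directed graph: 0 -> 1 -> 2, plus a separate edge 3 -> 4.
adj = {0: [1], 1: [2], 2: [], 3: [4], 4: []}
print(f"Reachability: {reachable_fraction(adj, {0}):.2%}")     # Reachability: 60.00%
print(f"Reachability: {reachable_fraction(adj, {0, 3}):.2%}")  # Reachability: 100.00%
```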

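pynetim.evaluation.seed_diversity(graph, seeds)

Computes the diversity of the seed nodes.

Diversity is assessed from the average distance between seed nodes.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

The diversity score, in the range [0, 1].

1.0 means the seed nodes are widely dispersed
0.0 means the seed nodes are tightly clustered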

Return type:

float

Example

>>> from pynetim.evaluation import seed_diversity
>>> diversity = seed_diversity(graph, seeds)

pynetim.evaluation.seed_overlap(seeds1, seeds2)

Computes the overlap rate between two seed sets.

Jaccard similarity = |S1 ∩ S2| / |S1 ∪ S2|

Parameters:

seeds1 (Set[int]) – First seed set.
seeds2 (Set[int]) – Second seed set.

Returns:

The overlap rate, in the range [0, 1].

Return type:

float

Example

>>> from pynetim.evaluation import seed_overlap
>>> overlap = seed_overlap(seeds1, seeds2)
>>> print(f"Overlap: {overlap:.2%}")
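
The Jaccard formula maps directly onto Python set operations. The sketch below uses a hypothetical function name and concrete seed sets chosen for illustration; returning 0.0 when both sets are empty is an assumption.

```python
def jaccard(s1, s2):
    """|S1 ∩ S2| / |S1 ∪ S2|."""
    union = s1 | s2
    if not union:
        return 0.0  # assumption: two empty sets have zero overlap
    return len(s1 & s2) / len(union)


seeds1 = {0, 1, 2, 3}
seeds2 = {2, 3, 4, 5}
# Two shared nodes out of six distinct nodes.
print(f"Overlap: {jaccard(seeds1, seeds2):.2%}")  # Overlap: 33.33%
```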

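pynetim.evaluation.spearman_correlation(ranking1, ranking2)

Computes the Spearman correlation coefficient between two rankings.

The Spearman coefficient assesses the monotonic relationship between two rankings and is suited to evaluating how strongly seed node rankings correlate.

Parameters:

ranking1 (List[int] | ndarray) – First ranking sequence.
ranking2 (List[int] | ndarray) – Second ranking sequence.

Returns:

(correlation coefficient, p-value)

The correlation coefficient lies in [-1, 1]
The p-value indicates statistical significance

Return type:

Tuple[float, float]

Raises:

ImportError – If scipy is not installed.
ValueError – If the inputs differ in length.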

Example

>>> from pynetim.evaluation import spearman_correlation
>>> ranking1 = [1, 2, 3, 4, 5]
>>> ranking2 = [1, 3, 2, 4, 5]
>>> rho, p = spearman_correlation(ranking1, ranking2)
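
When both inputs are tie-free rank vectors (each a permutation of 1..n), the coefficient reduces to the closed form rho = 1 - 6 Σd² / (n(n² - 1)). The sketch below evaluates that closed form with a hypothetical function name. It is not the library's implementation, it does not handle ties, and it does not produce a p-value.

```python
def spearman_rho(x, y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)); valid for tie-free ranks only."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Same inputs as the example above: sum(d^2) = 2, n = 5.
print(spearman_rho([1, 2, 3, 4, 5], [1, 3, 2, 4, 5]))  # 0.9
```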

pynetim.evaluation.top_k_accuracy(predicted_seeds, ground_truth_seeds, k=None)

Computes the Top-K accuracy.

Evaluates how much the predicted seed nodes overlap with the ground-truth seed nodes.

Parameters:

predicted_seeds (Set[int] | List[int]) – Predicted seed nodes (ordered by importance).
ground_truth_seeds (Set[int] | List[int]) – Ground-truth seed nodes.
k (int | None) – The K in Top-K. Defaults to min(len(predicted), len(ground_truth)).

Returns:

The Top-K accuracy, in the range [0, 1].

1.0 means complete overlap
0.0 means no overlap

Return type:

float

Example

>>> from pynetim.evaluation import top_k_accuracy
>>> predicted = [0, 1, 2, 3, 4]
>>> ground_truth = [0, 2, 1, 5, 6]
>>> acc = top_k_accuracy(predicted, ground_truth, k=3)
>>> print(f"Top-3 accuracy: {acc:.2%}")

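pynetim.evaluation.top_k_overlap(ranking1, ranking2, k)

Computes the overlap rate between two rankings within their Top-K positions.

Evaluates how much the seed sets selected by different algorithms overlap.

Parameters:

ranking1 (List[int] | Set[int]) – First ranking sequence (list of node IDs).
ranking2 (List[int] | Set[int]) – Second ranking sequence (list of node IDs).
k (int) – The K in Top-K.

Returns:

The overlap rate, in the range [0, 1].

1.0 means complete overlap
0.0 means no overlap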

Return type:

float

Example

>>> from pynetim.evaluation import top_k_overlap
>>> ranking1 = [0, 1, 2, 3, 4, 5]
>>> ranking2 = [0, 2, 1, 4, 3, 5]
>>> overlap = top_k_overlap(ranking1, ranking2, k=3)
>>> print(f"Top-3 overlap: {overlap:.4f}")
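
A minimal sketch of a Top-K overlap on plain lists follows. It divides the size of the intersection of the two Top-K sets by k; that denominator is an assumption, since the library could equally divide by the size of the union. The function name is hypothetical.

```python
def top_k_overlap_sketch(r1, r2, k):
    """|top-k(r1) ∩ top-k(r2)| / k (assumed denominator)."""
    return len(set(r1[:k]) & set(r2[:k])) / k


ranking1 = [0, 1, 2, 3, 4, 5]
ranking2 = [0, 2, 1, 4, 3, 5]
# The first three positions hold the same nodes in a different order.
print(f"Top-3 overlap: {top_k_overlap_sketch(ranking1, ranking2, 3):.4f}")  # Top-3 overlap: 1.0000
# At k=4 the sets are {0, 1, 2, 3} and {0, 1, 2, 4}.
print(f"Top-4 overlap: {top_k_overlap_sketch(ranking1, ranking2, 4):.4f}")  # Top-4 overlap: 0.7500
```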

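pynetim.evaluation.weight_statistics(graph, seeds)

Computes weight statistics for the edges associated with the seed nodes.

Parameters:

graph (IMGraph) – Graph object.
seeds (Set[int]) – Set of seed nodes.

Returns:

A dictionary with the following statistics:

mean_weight: mean weight
max_weight: maximum weight
min_weight: minimum weight
total_weight: total weight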

Return type:

Dict[str, float]

Example

>>> from pynetim.evaluation import weight_statistics
>>> stats = weight_statistics(graph, seeds)
>>> print(f"Mean weight: {stats['mean_weight']:.4f}")