Authorship analysis has traditionally focused on lexical and stylistic cues within text, while higher-level narrative structure remains underexplored, particularly for low-resource languages such as Urdu. This work proposes a graph-based framework that models Urdu novels as character interaction networks to examine whether authorial style can be inferred from narrative structure alone. Each novel is represented as a graph where nodes correspond to characters and edges denote their co-occurrence within narrative proximity. We systematically compare multiple graph representations, including global structural features, node-level semantic summaries, unsupervised graph embeddings, and supervised graph neural networks. Experiments on a dataset of 52 Urdu novels written by seven authors show that learned graph representations substantially outperform hand-crafted and unsupervised baselines, achieving up to 0.857 accuracy under a strict author-aware evaluation protocol.
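As an illustration of the graph construction described above, the following is a minimal sketch of a character co-occurrence network built from a tokenized text, together with a few global structural features. It is not the authors' pipeline. The abstract does not specify how "narrative proximity" is measured, so the token-based `window` parameter, the exact-match lookup of character names, and the function names `build_character_graph` and `global_features` are all assumptions made for this example.

```python
from collections import defaultdict


def build_character_graph(tokens, characters, window=10):
    """Build a weighted co-occurrence graph over character names.

    Assumption: two characters are linked when their mentions fall within
    `window` tokens of each other. The source only says "narrative
    proximity", so this definition is illustrative.
    Returns a dict mapping frozenset({a, b}) to a co-occurrence count.
    """
    names = set(characters)
    # Positions of character mentions, in narrative order.
    mentions = [(i, tok) for i, tok in enumerate(tokens) if tok in names]
    edges = defaultdict(int)
    for idx, (i, a) in enumerate(mentions):
        for j, b in mentions[idx + 1:]:
            if j - i > window:
                break  # later mentions are even farther away
            if a != b:
                edges[frozenset((a, b))] += 1
    return dict(edges)


def global_features(edges, characters):
    """Compute simple hand-crafted structural features of the graph."""
    n = len(set(characters))
    m = len(edges)
    degree = defaultdict(int)
    for pair in edges:
        for name in pair:
            degree[name] += 1
    return {
        "nodes": n,
        "edges": m,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "avg_degree": 2 * m / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
    }


if __name__ == "__main__":
    # Toy text with placeholder character names (not from the dataset).
    text = "Ali met Sara near the river while Zara waited far away in the old city with Ali"
    cast = ["Ali", "Sara", "Zara"]
    graph = build_character_graph(text.split(), cast, window=5)
    print(graph)
    print(global_features(graph, cast))
```

A feature vector of this kind corresponds to the hand-crafted "global structural features" baseline; the learned representations compared in this work would instead consume the graph itself.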