Object detection has transformed multiple domains by enabling models to jointly classify and localize targets in their input. Yet its potential in genomic sequence analysis remains largely unexplored. Here, we introduce DNA-DETR, an adaptation of the DETR architecture for one-dimensional genomic object detection. Surprisingly, directly applying object detection to DNA sequences yielded poor performance, even for elements with simple definitions such as Non-B DNA. We found that the widely used one-hot encoding failed to capture key structural features of several Non-B DNA types. To address this limitation, we systematically investigated how different sequence representations (one-hot encoding, dot matrix, and their combination) affect detection accuracy and model generalization. Our experiments demonstrate that the choice of representation profoundly affects both localization and classification. Notably, the combined representation consistently outperformed either single representation, particularly for complex sequence elements. Our findings suggest that there is no universal “one-representation-fits-all” solution in sequence feature learning. Despite the common perception that end-to-end learning diminishes the importance of representation, our results show that careful selection of sequence representation remains critical for model design.
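To make the three representations concrete, the following is a minimal sketch and not the DNA-DETR implementation. It assumes that a dot matrix is an L x L self-comparison of the sequence, optionally scored by Watson-Crick complementarity, and that the combined representation concatenates the two encodings position by position. The function names (`one_hot`, `dot_matrix`, `combined`), the `complement` flag, and the concatenation scheme are all hypothetical choices made for illustration.

```python
# Illustrative sketch only; names and the combination scheme are assumptions,
# not the encoding pipeline used by DNA-DETR.
ALPHABET = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def one_hot(seq):
    """Return an L x 4 matrix; bases outside ACGT (e.g. N) become all-zero rows."""
    return [[1 if base == letter else 0 for letter in ALPHABET]
            for base in seq.upper()]


def dot_matrix(seq, complement=False):
    """Return an L x L self-comparison matrix of the sequence.

    Entry (i, j) is 1 when base i equals base j or, with complement=True,
    when base j is the Watson-Crick complement of base i.
    Bases outside ACGT never match anything.
    """
    seq = seq.upper()
    if complement:
        targets = [COMPLEMENT.get(base) for base in seq]
    else:
        targets = [base if base in COMPLEMENT else None for base in seq]
    return [[1 if target is not None and target == other else 0 for other in seq]
            for target in targets]


def combined(seq):
    """Concatenate each position's one-hot row with its dot-matrix row.

    The result has shape L x (4 + L). This is one simple way to combine
    the two encodings and is assumed here purely for illustration.
    """
    return [oh_row + dm_row
            for oh_row, dm_row in zip(one_hot(seq), dot_matrix(seq))]


if __name__ == "__main__":
    seq = "GAATTC"  # a short inverted repeat (its own reverse complement)
    comp = dot_matrix(seq, complement=True)
    # For an inverted repeat, the anti-diagonal of the complement matrix is all ones.
    print([comp[i][len(seq) - 1 - i] for i in range(len(seq))])
```

In this sketch, the one-hot matrix describes each position in isolation, whereas the dot matrix encodes pairwise relationships between positions explicitly. Repeats and inverted repeats therefore appear as diagonal or anti-diagonal runs instead of having to be inferred from per-position features.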